PREFACE

This report assesses what is known at present about the deter- 
minants of educational effectiveness. The work was initially sponsored 
by the President's Commission on School Finance as part of its inquiry 
into alternative financial arrangements for primary and secondary educa- 
tion. One policy question that arose early in the Commission's work 
was, finance for what? Do the resources, processes, and organizations 
now being employed in primary and secondary education have an apprecia- 
ble impact on student achievement, defined broadly? To answer this 
question, the Commission sponsored a small interdisciplinary study at 
The Rand Corporation beginning in January 1971. Because of the potential 
interest of the work, Rand supplemented Commission funding with its own 
corporate research funds. This report presents the preliminary results
of that analytical effort. It represents, in the authors' view, a first 
step toward increasing the potential effectiveness of interdisciplinary 
research in education. 

Answering the question posed by the Commission required an examination
of many strands of research. In terms of traditional disciplines,
research on educational effectiveness covers political science, economics,
econometrics, psychology, psychometrics, sociology, and sociometrics, as
well as the discipline of education proper. Because our inquiry was
concerned with implications of research for policy, the analysis has
been organized not according to disciplines, but according to questions
about educational effectiveness and methods used to get results. The
authors set forth the assumptions underlying each approach, giving the
reader some sense of what he should look for when he encounters research 
claims about the effectiveness of educational instruments. The latter 
is particularly important, because it is impossible to cover every single 
study, and new results appear incessantly. 

In addition, the authors give recommendations for future research.
These recommendations, which were requested by the Commission, are 
found in the Summary. 
The report is organized according to the five approaches to educational
effectiveness discussed in the introductory section. The reader
who wants details on the findings of individual studies should use 
Appendix A and the Bibliography. 

The authors believe that this assessment will be useful not only 
to the President's Commission but also to educational researchers, 
policymakers, and laymen. On important issues of substance and on 
the design of future research, the authors have in most cases adopted 
a definite point of view — have taken a stand. 

The authors wish to express their gratitude for comments received 
on an early draft to Professors Richard Snow and Henry M. Levin of 
Stanford University; Professor Alex M. Mood of the University of 
California, Irvine; Joseph C. Kennedy and S. L. Sklar of the President’s 
Commission on School Finance; and Stephen M. Barro of the Rand staff. 
This report was edited by Helen Turin and typed under a demanding 
schedule by Kathy Hunt, Patty Mickelsen, Linda Taft, and Ruby Ueda. 






SUMMARY 



The President's Commission on School Finance is charged with the 
responsibility of making recommendations to the President regarding the 
role of the federal government in the finance of elementary and secondary 
education. The Commission wished to make its recommendations in the 
light of the knowledge accumulated by educational researchers. However,
every year literally thousands of educational research efforts
are reported, many of them using very sophisticated analytical techniques.
Moreover, the results of various studies are often conflicting or
inconsistent. The Commission requested The Rand Corporation to analyze and
summarize the relevant parts of this vast body of data.

OBJECTIVES AND METHOD 

The objective of our study was to assess the current state of knowl- 
edge regarding the determinants of educational effectiveness. To this 
end, we conducted a critical survey of educational research. The word 
"critical" emphasizes the most important aspect of our efforts. We have
attempted throughout our analysis to examine the validity and credibility
of research results. In the case of each research effort that we reviewed,
we tried to discover whether the researcher pursued proper methods for the
questions asked (internal validity) and, if so, whether the results were
credible in the light of accumulated knowledge (inter-study consistency).
Our study, then, is not a classical survey of research listing findings
without much
evaluation of the results; rather, it is our answer to the question, "What 
does the research tell us about educational effectiveness?" 

FIVE RESEARCH APPROACHES 

The body of research on educational effectiveness is very large. We 
found it useful to organize our analysis according to basic research 
approaches used by researchers -- that is, according to the aspect of 
education being studied, the question being asked, and the methods deemed 
appropriate to answer that question. We identified five basic approaches 
used in educational research: the input-output, the process, the
organizational, the evaluation, and the experiential approaches.

The input-output approach assumes that students' educational
outcomes are determined by the quantities and qualities of the educational
resources they receive. The Equality of Educational Opportunity Survey
(known as the Coleman Report after its principal author, James Coleman)
is the best-known example of this approach, that of the educational
economist, to educational research.

The process approach includes most of the work done by educational
psychologists, as well as certain studies by sociologists and clinical 
and experimental psychologists. These studies attempt to examine the 
processes and methods by which resources are applied to students. 

The organizational approach consists of case studies of school
systems that assume what is done in the school is not the result of
a rational search for effective inputs or processes, but is a reflection
of history, social demands, and organizational change and rigidity.
These studies are typically done by political scientists or sociologists
and focus on the ways in which the factors that influence or impinge on
the various decisionmakers in the school system affect the behavior of
the system.

Studies of relatively large-scale interventions in school systems
are included in the evaluation approach. Examples include the evaluations
of compensatory education programs for the disadvantaged, funded
by Title I of the Elementary and Secondary Education Act (1965), and
the evaluations of Head Start Programs. The central issue in these
studies is whether broad-based interventions affect students' outcomes.

Finally, we include in the experiential approach the so-called
"reform" literature. These are books and articles, typically written
by teachers or advocates of educational reform, that describe how the
school system works and what it does to those on the inside, particularly
students. They share the view that what happens to the student in school
is an end in itself, rather than a means toward some further end, such
as the acquisition of specific skills.

Previous analysis has covered the following ground. The input- 
output approach has been reviewed twice by other analysts. Each of 
these reviews contains substantive errors; each of them is incomplete.

The process approach contains many excellent review articles, but they
tend to focus on relatively narrow issues. The only survey that covers
the entire spectrum of studies in this area is The Handbook of Research
on Teaching, an encyclopedic work that summarizes research efforts but
offers no overall conclusions about educational effectiveness. To our
knowledge, there has been no previous attempt to make a systematic
assessment of the results of studies in the organizational approach as
they relate to educational effectiveness. Evaluations of interventions
in school systems have been collected, but they have tended to focus on
the efficacy of one or another particular program and not upon obtaining
generalized information as to what has been proven to be effective and
what has not been effective. The experiential approach, finally, is not
even generally recognized as being an area of research. Although
individual books have been reviewed, ours is one of the first attempts to
bring together the results of the many studies of this sort.

PROCEDURE 

The formal procedure we used in our analysis is outlined in Chart
1. We examined individual studies in each approach and attempted to
determine whether they were internally valid. Did the researcher use
methods appropriate to the problem he addressed? Did he interpret his
results correctly in view of the advantages and limitations of the
analytical techniques he used? We discarded those studies that did not
satisfy minimum requirements of internal validity. We also made the
maximum possible use of previous reviews. However, for particularly
important studies we returned to the original source, even when the
results of these studies were already included in one or more review
articles.

The next step was to bring together the results of the individual 
studies and of the previous reviews. We attempted to derive general 
conclusions as to what were the overall results of the many research 
efforts. Our primary criterion was inter-study consistency. Did the
results tend to support or reinforce one another? Or did we find that 
roughly similar studies, asking basically the same question and using 
basically the same methods, yielded substantively different results?
This procedure was followed for each of the five approaches.

Finally, we combined these five sets of results to derive overall 
conclusions as to what is now known about educational effectiveness. It 
is from these conclusions that we drew our policy implications. 

LIMITATIONS OF AVAILABLE RESEARCH 

Before presenting our conclusions, we must emphasize that in
assessing the results of research on educational effectiveness, we
discovered that the research done thus far is subject to many limitations.
The results of educational research can be properly assessed only with
these limitations clearly in mind. Each approach is subject to analytical
problems peculiar to its commonly used techniques. More important,
four substantive problems are encountered in virtually every area of 
educational research. 

First, the data used by researchers are, at best, crude measures
of what is really happening. Education is an extremely complex and
subtle phenomenon. Researchers in education are plagued by the virtual 
impossibility of measuring those aspects of education they wish to study. 
For example, a student's cognitive achievement is typically measured by 
his score on a standardized achievement test, despite the many serious 
problems involved in interpreting such scores. 

Second, educational outcomes are almost exclusively measured by
cognitive achievement. Although no one would deny that non-cognitive
outcomes and social outcomes beyond the individual student level are of
major importance, research efforts that focus on these outcomes are sparse
and largely inconclusive and offer little guidance with respect to what
is effective. In general, then, whenever we refer to "educational
outcome" throughout the discussion, we mean the student's cognitive ability
as measured by standardized achievement tests.

Third, there is virtually no examination of the cost implications
of research results. This makes it very difficult to translate research
results into policy-relevant statements.

Finally, few studies maintain adequate controls over what actually
goes on in the classroom as it relates to achievement. Thus, researchers'
data may well be affected by circumstances unrecognized in their analyses.
For example, it is not unusual to find a researcher comparing the relative
effectiveness of instructional methods A and B. He might train one
group of teachers in the use of method A and another in the use of method
B, and at some later point, he would measure and compare the cognitive
skills of the students who were taught by teachers in the two groups.
The validity of the results generated in such a study would, of course,
depend, among other things, upon whether the teachers did in fact use
methods A or B in their classrooms.

WHERE WE ARE NOW 

With the limitations of research clearly in mind, we return to the 
basic issue of educational effectiveness. The current status of research 
in this area can be described by the following propositions:

Proposition 1: Research has not identified a variant of
the existing system that is consistently related to students'
educational outcomes.

Proposition 2: Research suggests that the larger the school
system, the less likely it is to display innovation,
responsiveness, and adaptation and the more likely it is to depend
upon exogenous shocks to the system.

Proposition 3: Research tentatively suggests that improvement
in student outcomes, cognitive and non-cognitive, may require
sweeping changes in the organization, structure, and conduct
of educational experience.

In Proposition 1, the phrase "a variant of the existing system" is 
used to describe a broad range of alternative interventions in the 
existing system. We include changes in school resources, processes,
organization, and aggregate levels of funding.

We must emphasize that we are not saying that nothing makes a
difference, or that nothing "works." Rather, we are saying that research
has found nothing that consistently and unambiguously makes a difference
in student outcomes. The literature contains numerous examples of
educational practices that do seem to have significantly affected
student outcomes. The problem is that other studies, similar in approach
and method, find the same educational practice to be ineffective; and we 
have no clear idea of why this discrepancy exists. In short, research 
has not discovered any educational practice (or set of practices) that 
offers a high probability of success over time and place. 

We must also emphasize that we are not saying that school does not 
affect students' outcomes. Our only knowledge of what American students' 
outcomes would be were they not to attend school at all is on the basis 
of isolated and unrepresentative examples. Educational research focuses 
on variants of the existing system and tells us nothing about where we 
might be in the absence of the system. 

We can view ourselves figuratively as being in a "flat" area.
Movements in various directions from our current position do not seem
to affect our altitude. Furthermore, we do not know whether this flat 
spot is at the bottom of a well, on a broad plain, or atop a tall plateau. 

The research contains some evidence supporting Proposition 2, leading 
to the conclusion that large systems are less likely to be innovative,
responsive, or adaptive than are small systems. Further, whatever the
size of the system, real innovation is apt to come from outside pressures,
from the community or from the federal government, rather than from within.
However, since relatively little research has been directed toward these
issues, this finding must be viewed as tentative.

The evidence in support of Proposition 3 comes from two sources: the
negative results found under the first four approaches, and the descriptive
research discussed under the experiential approach. It should be
pointed out, however, that the experiential approach offers little in the
way of strong generalizable evidence to support any particular prescription
for solving "the crisis in the classroom." Therefore, Proposition 3 must
be regarded as a tentative inference only.

WHERE THIS IS LEADING US 

The findings discussed above imply that research has not identified
an approach to education that offers substantial promise of significant
improvement in educational outcomes across the board. They raise an
obvious question: Where do we go from here? Three important hypotheses
are suggested by the research.

First, there is considerable evidence that "non-school" factors may 
be more important determinants of educational outcomes than are "school" 
factors. There is good reason to ask whether our educational problems 
are, in fact, school problems. The most profitable line of attack on 
educational problems may not, after all, be through the schools.

Second, there is some (weak) evidence that the impact of an educational
practice may be conditional on other aspects of the situation.
Simply stated, this hypothesis argues that teacher, student, instructional
method, and, perhaps, other aspects of the educational process interact
with each other. Thus, a teacher who works well (is effective) with 
one type of student using one method might be ineffective when working 
with another student having different characteristics, or when using 
another method. The effectiveness of a teacher, or method, or whatever, 
varies from one situation to another. 

Finally, there is a suggestion that substantial improvement in educa- 
tional outcomes can be obtained only through a vastly different form of 
education. Voucher systems, open schools, performance contracting, and 
the like have been suggested. We emphasize, however, that there is little 
research dealing with the effectiveness of these forms of education. And 
there is certainly a possibility that they may be less effective than the 
current system. At this point we can say only that the research has not 
identified any way of obtaining significant improvements in educational
outcomes throughout the current system. In other words, Proposition 3
remains largely untested.

POLICY IMPLICATIONS OF THE RESEARCH 

Our review of the research suggests two major implications for
school finance:

Proposition 4: Increasing expenditures on traditional educational
practices is not likely to improve educational outcomes
substantially.

and

Proposition 5: There seem to be opportunities for significant
redirections and in some cases reductions in educational
expenditures without deterioration in educational outcomes.

The first of these follows directly from the previous discussion. 

The second implication is also based on the above discussion, but
more indirectly. Researchers have examined many variants of the existing
educational system. As we indicated above, none of these variants has
been shown to effect a significant improvement in educational outcomes.
A fact often overlooked is that none has been shown to degrade outcomes
significantly either. Consequently, there is a long list of equally 
effective variants of the existing system, and, if these variants are not 
all equally expensive, then by choosing the least expensive we could re- 
duce costs without also reducing effectiveness. One of the major limita- 
tions of educational research, however, is the absence of cost considera- 
tions. The research now available does not indicate which of the apparently 
equally effective variants is least expensive. 
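
The reasoning behind Proposition 5 can be sketched in a few lines of
Python. This is an illustration only: the variant names, effect
estimates, costs, and the tolerance used to call two variants "equally
effective" are all hypothetical, and, as noted above, the research does
not in fact supply the cost figures such a calculation would need.

```python
# Hypothetical illustration only: none of these figures come from the research.
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    effect: float  # estimated change in outcome (hypothetical units)
    cost: float    # cost per student (hypothetical dollars)


def cheapest_equivalent(variants, baseline_effect, tolerance):
    """Return the least expensive variant whose effect lies within
    `tolerance` of the baseline (neither distinguishably better nor worse)."""
    equivalent = [v for v in variants
                  if abs(v.effect - baseline_effect) <= tolerance]
    if not equivalent:
        raise ValueError("no variant is equivalent to the baseline")
    return min(equivalent, key=lambda v: v.cost)


variants = [
    Variant("current practice", effect=0.00, cost=900.0),
    Variant("variant A", effect=0.02, cost=1100.0),
    Variant("variant B", effect=-0.01, cost=780.0),
    Variant("variant C", effect=-0.30, cost=500.0),  # clearly worse: excluded
]

choice = cheapest_equivalent(variants, baseline_effect=0.0, tolerance=0.05)
print(choice.name, choice.cost)
```

The cheapest variant overall (variant C) is rejected because its effect
falls outside the tolerance; the selection is made only among variants
that cannot be told apart from current practice.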

It should go without saying that reductions in cost that impinge 
seriously on the health, safety, or welfare of the student should not be 
tolerated. This study reviews what is known about educational effective- 
ness. Wherever overcrowding, unsanitary conditions, and unsafe physical 
plant exist, reducing expenditures should not be considered, and redirec- 
tion or increase may be in order. 

IMPLICATIONS FOR EDUCATIONAL RESEARCH 

Despite the volume of educational research that has been conducted,
there are still many major gaps in our understanding of the educational
process. We have identified six major issues toward which we believe
educational research could profitably be directed. First, research
must examine the extent to which, and under what conditions, learning
takes place outside the school. Second, the concept of interactions
must be more deeply investigated. Third, the vastly different forms of
education that have been suggested as alternatives to the present system
should be investigated. Fourth, we must begin to examine educational
outcomes over time and on many dimensions. Fifth, the approaches must
be merged. Each offers insights not available to those who work in the
others. Each has blind spots. There have been far too few attempts
to use the strengths of one approach to overcome the weaknesses of
another. And, sixth, analyses must recognize the cost implications
of their results.

Finally, this work is a beginning of a larger task: the creation
of models that can respond to the challenge of policy-relevance, not
only in the short run, but also in considering the long-range aims of
education in society.

1. INTRODUCTION



This report presents the results of an analysis of the research 
on educational effectiveness. The objective of this analysis was to 
assess the current state of knowledge regarding the determinants of 
educational outcomes. We attempted to accomplish this task by con- 
ducting a critical survey of the research. 




EDUCATIONAL RESEARCH AND EDUCATIONAL POLICY 

Each year literally thousands of educational research efforts are 
reported. New results are constantly being presented. The vast body 
of literature on educational effectiveness should provide a firm 
foundation for the formulation of educational policy. Thus far, it
has not done so.

There are a number of reasons for the gap between educational 
research and educational policy. First, there are many diverse streams 
of educational research. In terms of traditional disciplines, research
on educational effectiveness appears in economics, econometrics,
political science, psychology, psychometrics, sociology, and sociometrics,
as well as the discipline of education proper. Researchers have tended
to follow relatively narrow, intra-disciplinary paths. There have been
few attempts to connect these paths; nor is there a clear map down any
given path. Policymaker and researcher alike, therefore, find it very
difficult to draw policy implications from these various disciplines.

Second, the sheer magnitude of the literature on educational
effectiveness makes it virtually impossible to keep up to date on the
research being conducted in any one field, let alone to maintain
awareness of what is being produced across the entire range of educational
research.

Third, educational research has seldom been explicitly policy-oriented.
A considerable [...] with
increasing understanding of how, and under what conditions, learning
takes place. But [...] has rarely been framed in the [...]

Fourth, and perhaps most important, the research is full of
contradictory or inconsistent findings. The policymaker thus finds
himself constantly basing his decisions on controversial and disputed
research results.

This analysis is directed toward the needs of the educational 
policymaker. We believe that what is important for the inquiry at
hand is to extract the policy-relevant findings from the research
and to derive from them broadly based conclusions as to what we now 
know about educational effectiveness. The analysis is based upon 
comprehensive reviews of the many streams of educational research. 

We have attempted, throughout the analysis, to examine the validity 
and credibility of research results. In the case of each research
effort reviewed, we tried to discover whether the study was internally
valid (did the researcher pursue proper methods for the questions he
addressed?) and, if it was, whether the results were credible in the
light of accumulated knowledge (were the findings consistent with those
of other studies in the area?).

The need for examination of internal validity is clear. We
cannot base policy on incorrect or misleading research results.
Accordingly, we must ask whether the results of any particular study
were generated by a proper method of analysis.

Just as important is the issue of credibility (external validity).
There is always some chance that a particular variable, or a particular
set of variables, that appears to have a significant effect upon
achievement is in fact unrelated to educational outcome. For this
reason, educational policy cannot rely on the results of any one study.
Whether studies say anything about actual educational outcomes depends,
then, on results that appear consistently throughout a number of studies.
If an educational resource or procedure shows up as important
in a large number of studies, then we should have relatively high
confidence in stating that this resource or procedure should be selected
by policymakers (allowing for the relative costs of resources and
procedures). The ability of analysis to make well-supported
statements about resources that should be selected by policymakers
thus depends on evidence of external validity.

Note that an examination of credibility serves three distinct
purposes: First, it provides a way of summarizing numerous disparate
studies. Second, it addresses the question of what should be believed
in the face of inconsistent or conflicting results. Essentially, we
resolve such conflicts by "adding up" the evidence on each side of a
dispute. Third, consideration of external validity enables us to deal
with the avalanche of research results. No review, this one included,
could possibly consider every single educational research study. But
if a large number of internally valid studies yield consistent results,
then one can be fairly sure that any omitted study would not have
substantively changed one's conclusions.
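
The "adding up" of evidence described above can be illustrated with a
short Python sketch. It is a simplified tally, not the procedure the
authors actually followed: the study findings are hypothetical, and the
0.75 agreement threshold is an assumption chosen only to make the idea
concrete.

```python
from collections import Counter


def add_up_evidence(findings):
    """Tally internally valid studies by the direction of their finding.

    `findings` is an iterable of (is_internally_valid, direction) pairs,
    where direction is "positive", "negative", or "none".
    Studies that fail the internal-validity screen are discarded.
    """
    return Counter(direction for valid, direction in findings if valid)


def is_consistent(tally, threshold=0.75):
    """Treat results as consistent if one direction accounts for at least
    `threshold` of the valid studies (the threshold is an assumption)."""
    total = sum(tally.values())
    if total == 0:
        return False
    return max(tally.values()) / total >= threshold


# Hypothetical findings on a single resource, for example class size.
findings = [
    (True, "positive"), (True, "none"), (True, "negative"),
    (True, "none"), (False, "positive"), (True, "positive"),
]
tally = add_up_evidence(findings)
print(dict(tally), is_consistent(tally))
```

In this made-up case the five valid studies split two, two, and one
across the three directions, so no direction dominates and the evidence
would be judged inconsistent.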

What follows, then, is not a classical review of research, listing
findings without much evaluation of the results. Rather, it is our
answer to the question, What does the research tell us about educational
effectiveness?

Accomplishing our objective required that this vast body of 
literature be organized and evaluated on the basis of some analytical 
structure. Our discussion of the research on educational effectiveness 
is organized according to five basic research approaches, that is,
according to the aspect of education that is examined in the analysis,
the question being addressed, and the method deemed appropriate to
answer that question.

FIVE RESEARCH APPROACHES 

The five approaches provide a way of collecting together studies
that share a similar focus and purpose and that use similar analytical
techniques. We can thus identify the similarities and differences
among the many streams of educational research. Individual studies or
groups of similar studies are placed in perspective. Moreover, common
standards of internal validity apply to studies within each approach.
This simplifies the task of evaluating the results of individual
research efforts. Finally, because studies in an approach tend to
have a common orientation, the relationships among their results are
more easily observed.



The Input-Output Approach 

Much of the research produced in the input-output approach has
been prominent in recent policy debates: for example, the Equality
of Educational Opportunity Survey (Coleman, 1966) and its various
re-analyses. Research in this approach views the school as a black
box containing students (Fig. 1). Resources are applied to the students
in the box, and from this application some output flows. Output
is usually defined in terms of cognitive achievement as measured
by standardized achievement tests. Occasionally, studies deal with
the drop-out rate or the rate at which students go on to college as
outputs. School resources, or inputs, generally include a broad
range of factors describing teachers' characteristics (experience
and verbal ability are two examples), and physical attributes of the
school (the number of library books per student, age of building, class
size, and the like).



Research is directed toward the question, To what extent are
variations in educational outcomes due to variations in resource
levels? Ideally, the research is supposed to identify the extent to
which each resource contributes to educational outcomes. Policymakers
should then be able to identify those resources that are most effective
and restructure the current use of resources toward the more effective
configurations discovered by research.



Fig. 1 -- The input-output approach: inputs (school resources) are
applied to students, yielding outputs (educational outcomes)

The empirical problem is to establish the relation between input
and output. In practice, statistical analysis is applied to ex post,
cross-sectional data, although the desire for longitudinal data is
often asserted. In other words, the analyst collects a body of data
at a point in time (usually survey data), applies various statistical
techniques (usually multiple regression), and tries to make statements
about the effects of inputs.
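
A minimal sketch of such an analysis follows, written in plain Python so
that it is self-contained. The schools, the two input measures, and the
scores are hypothetical, and the small least-squares routine stands in
for whatever statistical package an analyst would actually use. The
scores are generated from a known rule with no noise, so the fit simply
recovers that rule; real survey data would not behave so cleanly.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations.
    Returns [intercept, b1, b2, ...]."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical cross-section, one row per school:
# (teacher experience in years, library books per student).
inputs = [(2, 4), (5, 3), (8, 6), (3, 8), (10, 5), (6, 9)]
# Scores built from a known rule (50 + 1.5*experience + 0.5*books)
# so the fitted coefficients can be checked against it.
scores = [50 + 1.5 * e + 0.5 * b for e, b in inputs]

intercept, b_experience, b_books = ols(inputs, scores)
print(round(intercept, 6), round(b_experience, 6), round(b_books, 6))
```

Each fitted coefficient is read as the change in output associated with
a one-unit change in that input, holding the other input fixed. That is
the kind of statement about "the effects of inputs" the approach aims to
make.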

The confidence we can place in research results depends upon
(1) the internal validity of particular studies, that is, the logic and
design of the analysis, and (2) the external validity, that is, the
consistency of findings across studies. With respect to internal
validity one asks, Were the procedures generally accepted for this
approach carefully followed? And, if so, are the results consistent with
the underlying model? For the input-output approach, internal validity
is measured by tests of significance and goodness of fit.^1 External
validity concerns whether studies say something about the real
educational world. Do they say something about the schools? Here the
test is inter-study consistency. Are the resources identified as
effective in one study also found to be effective in other studies?^2

^1 These terms are defined below. See Section III.

^2 In theory, external validity could rest on acquiring new,
unanalyzed data on exactly the variables considered in any given study.
In practice, this would be very costly. For example, few would now
advocate a replication of the Coleman survey. So the test of
inter-study consistency becomes ad hoc. Do studies that address the same
question with somewhat different variables and somewhat different data
suggest that the same inputs are important? If so, then those who use
the input-output approach say there is a case that the same kinds of
resources determine the same kinds of outputs. But they can never be
sure. See Section III.

The Process Approach

The second approach, education as a process, is based on a
quite different fundamental assumption about what determines educational
outcomes (Fig. 2). Here the researcher focuses on the "inside"
of the box. Resources are assumed to be predetermined or given. What
matters here are the processes by which the resources are applied to
the students and the response of the students to the processes. If we
can correctly identify processes of education or learning, they will
determine the quantities of resources that the schools require. The
processes of concern can be those connected with teachers, students,
instruction, or the interactions among them. Educational outcomes for
the most part are limited measures of cognitive achievement. In a few
cases noncognitive achievement is examined.

Fig. 2 -- The process approach (output: educational outcomes)



In most cases, the main purpose of this approach is to extend
our knowledge about educational processes. In general, there has been
much less orientation toward concrete policy action among researchers
here than among those who pursue the input-output approach. The
policy applications have so far been secondary. To illustrate, when
conducting an experiment, psychologists lay great stress on
experimental control of confounding variables. Sometimes, in order to
minimize the extent to which a student's previous learning experiences
affect the outcome of an experiment, they deliberately examine
learning tasks that are very unlike the learning tasks encountered in
the classroom (memorizing lists of nonsense syllables, for example).
Consequently, the results of the experiment offer little direct policy
guidance.

Research here usually consists of small-scale experiments or
variations in treatment, often performed in a laboratory. The problem
then becomes one of putting together those [...] that bear on the
same [...] to see whether they are consistent. The experiments are
collated through review articles varying greatly in quality. Here
internal validity depends upon whether the studies have proper
experimental design, whether they controlled for everything that
could confound the results; external validity, again, depends upon
consistency among studies. Do the same processes appear to affect
academic achievement in the same way across a number of studies?

The Organizational Approach 

In the organizational approach to the issue of educational
effectiveness, what is done in the schools is viewed as being not the
result of a rational search for effective inputs or processes but a
reflection of history, social demands, and organizational change and
rigidities. In Fig. 3 we distort the shape of the "box," because its
structure matters here (the school system as a whole). The inputs are
the rules, the procedures, the incentives that are set up within the
system. The approach is more concerned with the people in the system
(teachers, administrators, appointed and elected officials) than are
the previous two approaches. The measure of responsiveness to change
is the ability to adapt to a changing clientele. The assumption is
that responsive schools will deliver satisfactory academic outcomes,
but not necessarily the maximum feasible outcomes.^ Why? Because in
this approach the schools have multiple objectives, not just academic
outcomes; they do many things. And the schools are doing well if they
get satisfactory achievement along with the other goals that have to
be satisfied.

The perceived crisis of the classroom is caused by an inflexible 
stand in the face of changing demands by students, parents, the 
immediate community, and the government. The purpose, then, is to 
understand the behavior of the whole system and describe the shape 
of the box and how and what happens to the people in it -- not just 
the students, but the teachers, the administrators, and the community 
as a whole.



Research here primarily uses case-study methods. There are no
formal tests of either internal or external validity; in fact, it is
rare in these case studies to find much concern about such matters.^2
There are no statistical tests, almost by definition; inter-study
consistency is hard to determine, since the point to be illustrated
rarely recurs. Nevertheless, we try to apply "reasonable" criteria of
our own to assess these studies.



Although the organizational approach is relatively undeveloped
(as compared with the previous two approaches), we believe that it is
closely related to schools' finances. The leverage of alternative
financial schemes seems greater on organizational structures than it
does on resources or on processes. It is hard to see how overall
financial schemes could be tied to the internal use of resources or
processes of school systems without creating massive problems of
administration and control. It is possible that alternative financial
schemes, if they can be found, could affect the shape of our
educational box to make it more receptive to effective resources or
processes.






^We emphasize that this is an assumption. We are aware of no
empirical evidence that students' outcomes are related to the
responsiveness of their school. It does, however, seem a reasonable
thing to believe.

^2 Although case studies flourish in educational research as
elsewhere, evaluations of the methods are very difficult to find. But
see Bock (1962).

The Evaluation Approach

Studies within the evaluation approach attempt to analyze the
effectiveness of broad educational interventions that are directly
related to large issues of social policy. Essentially these are
analyses of programs in which treatments are devoted to "groups of
children as a whole in diverse programs, taken as a whole" (Stearns,
1971a, p. 6). In such interventions the resources devoted to each child
are increased substantially. Since any number of educational inputs
are changed at the same time, it is difficult to tell precisely which
program features are responsible, even where there is demonstrated
success. Researchers using this method tend to address the question:
To what extent did a generalized intervention affect educational
outcomes?

Research focuses on school systems in which there have been
large-scale interventions (Fig. 4). The primary concern is to identify
the relationship between the existence (or magnitude) of an
intervention and educational outcomes. It should be noted that these
analyses seldom attempt to determine why or how an intervention
affected outcomes. This contrasts with the other approaches, in which
the analyst focuses on the impact of a particular educational
practice.

These studies tend to be more policy oriented than those included 
in any other approach. Their general purpose or goal is to discover 
what "works." The implication is that if we can discover what "works," 
then we can replicate the intervention elsewhere. 

The analytical technique used to discover whether an intervention
was successful is ex post examination of the outcomes of students upon
whom the intervention was focused. The evaluator typically attempts to
identify a group of students who, although not themselves targets of
the intervention, resemble the students who were. He then compares the
outcomes of the target group of students with the outcomes of the
nontarget, or control, group of students. Any differences in outcomes
are presumed to be reflections of the intervention's impact.

In evaluations, the researcher usually chooses the members of the
control group after the program has begun rather than by some random
process, and there is always the possibility of some systematic
difference between control group members and target group members. If
there is, differences in outcomes between the two groups may reflect
the difference between the groups and not the impact of the
intervention. Accordingly, the question of internal validity hinges on
the method by which a control group was chosen.^









The Experiential Approach

The experiential approach is concerned with what happens to students
in schools as an end in itself. The school is viewed as an institution
containing, and having an impact on, students (Fig. 5). It is generally
(but not always) acknowledged that the impact of the school may affect
educational outcomes. But this is not viewed as being the primary
concern. Rather, considerable importance is placed on that impact as an
outcome in itself. The primary emphasis is on the effects of school
experiences on students' self-concepts and on their relation to other
people and to social institutions.

Fig. 5. The experiential approach

^Note that we do not ask whether a study has a proper experimental
design. If it had, it would have been included in the process approach.

The purpose of these studies is to show how the system works and
its impact on those within the system. The central question addressed
is, What does the school do to students? The research is conducted by
"on-the-spot" observation. That is, research reports in this approach
are frequently provided by participant observers in the form of
descriptions of their experiences. Others were done by people outside
the formal education system who were proponents of educational reform.

It is always difficult to examine the internal validity of case
studies, which are often used in the experiential literature. And case
studies by participant observers are usually the most difficult. The
participant observer reports and interprets what he has seen, but what
he has seen is in large part his own behavior and the response of
others to his behavior. In fact, one of the presumed advantages of
participant observation is the insight obtained by actively [...]
being studied. The objectivity of the researcher becomes a [major]
issue. Further, the majority of studies we reviewed in this approach
were not conducted by professional researchers but by persons who
entered the system intending to be teachers but who were so incensed
at what they observed that they felt compelled to communicate their
observations to others. Accordingly, their feelings are an important
aspect of what they report.

We have attempted to examine the internal validity of these reports
by asking whether they are internally consistent. (Does the author seem
to interpret what he observes in a consistent manner?) We have also
tried to discover whether his observations seem to be based on
circumstances peculiar to his situation. (Do his observations concern
his class, school, or system, or do his observations concern aspects
of education in general?)




We have tried to derive from each work reviewed a set of
propositions about the impact of the educational system on students.
External validity was checked by comparing these sets of propositions
to discover which seemed to be supported by a number of persons in
different circumstances.

SCOPE AND LIMITATIONS OF THE ANALYSIS 



A significant volume of educational research is based on a priori
reasoning. That is, the researcher begins with some general
propositions about learning that he believes to be true. From an
analysis of these general propositions the researcher derives specific
propositions regarding the effectiveness of particular educational
practices (instructional methods or materials, characteristics or
skills of teachers, and so on). Activities such as this are a vital
part of the research process; but they are only a part of the research
process. We know very little about the nature of learning. Our theories
and models of learning have many gaps and should be regarded as, at
best, crude approximations to reality. Accordingly, the specific
propositions derived from general theories of learning can be viewed
only as hypotheses. They may be true, but it is quite possible that
they are false. Until they have been subjected to empirical test, they
must be viewed as unproved. We have considered only studies in which
some substantive, empirical evidence is presented in support of the
researcher's claims.



An education system has many functions and many outputs. Some
outputs relate directly to the student, others hardly, if at all. For
example, the school system must interact with the community and must
provide a number of outcomes relevant to the community. In doing so,
the school may sometimes act in ways that seem to operate against
desired outcomes for the student. The school also has a political role
and must provide outcomes that allow it to compete within a political
system for power, money, and position. Whatever importance one assigns
to political and social functions, it seems to us that they are not
the school's primary objective, which is to educate students.
Throughout this report we focus on research into the determinants of
student learning.

What exactly does student learning mean? The easiest and perhaps
the first definition that comes to mind is to interpret learning as
the acquisition of knowledge and cognitive skills. In practice, this
has mainly been reduced to using standardized tests for measuring
retention of specific subject matter; higher cognitive processes
(abstract reasoning, problem solving, and creative thinking, among
others) are seldom measured (Klein, 1971). Teacher grades and essay
examinations are sometimes used as measures of broad cognitive
abilities, but these measures are extremely unreliable. Along with the
general failure to measure cognitive achievement adequately, there is
an almost total failure to evaluate and identify "noncognitive
achievement."^ Thus, of the many and diverse kinds of student learning,
almost all of the educational research that examines student learning
is based on a narrow range of cognitive skills as measured by
standardized tests.

By and large researchers have not employed broad measures of
student learning nor have they resolved the important problem of
individual priorities of educational outcomes. However, one does
find that many of these same researchers who have not been able to
resolve this problem analytically frequently discuss the importance of
priorities and individual differences in priorities. It is becoming
increasingly clear that different educational objectives and values
exist as well as individual differences in types and levels of ability.
We must therefore realize that research based on limited measures,
and accounting for relatively few objectives, cannot lead to
conclusive generalizations about educational outcomes.

^This expression is used because it is becoming vogue in education
literature, although "achievement" is not the best term to use in this
regard. It would be more accurate to talk about noncognitive growth,
but debate over terms seems relatively unproductive as long as it is
generally understood what the term "noncognitive achievement" means.
In particular, we include the concepts traditionally described by the
term "affective domain" in "noncognitive achievement."

In this report we have avoided any explicit discussion of the aims
of education (although implicit criteria are inevitable whenever
effectiveness is discussed) for two reasons. First, a study of the
aims of education was not part of our charter. Nonetheless, certain
issues are necessarily raised:

o To what extent should education be an agent of social reform as
compared with a force for social stability?

o To what extent should education be oriented toward vocations,
to personal development, to the pursuit of knowledge, to
screening people by ability categories?

Second, we are reluctant to address the aims of education because the
researcher is no more competent to solve these issues than is any other
citizen. The question is one of values. In any case, we have had to
recognize these issues because they are inescapable in consideration of
any social policy.

There are additional limitations on the scope of our work. Because
we did not have time and resources enough, we could not cover all
the existing research. In particular:

o We reviewed very little of the pre-1950 literature on educational
effectiveness. This meant excluding such classics as the
Progressive Education Association's Eight-Year Study and Terman's
work on gifted children.

o We reviewed very little of the sociologists' and political
scientists' research on educational effectiveness except as
it related to the organizational and experiential approaches.

o We have not reviewed the findings of educational philosophers,
[...] experiential approach.

Because the measurement of educational outcomes is a central
issue in research on educational effectiveness, we discuss measurement
problems in detail in Section II. Sections III through VII are devoted
to reviews of the research in each of the five approaches. Section VIII
summarizes the results and presents our conclusions and policy
implications.

II. MEASURING EDUCATIONAL OUTCOMES

EDUCATIONAL OUTCOMES

Students' educational outcomes are generally divided into two
categories: cognitive and noncognitive. Noncognitive factors include
motivation, attitudes, learning styles, social skills, self-awareness,
and even such vague but important concepts as happiness and quality of
life. These factors engender two different viewpoints. One view
contends that noncognitive factors are important because they are
believed to be the major determinant of cognitive achievement;
evidence presented later in this report supports this view.^ The
other view holds that growth in noncognitive factors is the more
relevant goal of education. These views are certainly not mutually
exclusive, and most educators agree that noncognitive factors are
important for both reasons. In fact, the distinction between cognitive
and noncognitive achievement is rather artificial: Attitudes and
motivation have strong intrinsic cognitive components, and cognitive
skills and abilities have strong intrinsic noncognitive components.

Education in general, and compensatory education in particular,
is concerned with improving student motivation, attitudes, and general
affective (noncognitive) behavior. Generalization of cognitive
ability results not only from the transfer of specific skills, but
also from such noncognitive factors as the establishment of learning
styles, learning sets, motivation for learning, and attitudes about
learning. Noncognitive factors undoubtedly outweigh specific cognitive
skills in importance for future learning, although acquiring
cognitive skills may itself considerably affect noncognitive factors
such as motivation, self-awareness, and the like. In their book on
evaluation of learning, Bloom, Hastings, and Madaus (1971) devote an
entire chapter to measuring affective behavior and include affective
goals in stated educational objectives. Recent research literature,
especially that related to compensatory and preschool education,
repeatedly comments on the importance of noncognitive factors in
determining cognitive achievement and the necessity of identifying,
measuring, and shaping these factors at an early age (for example,
Denenberg, 1970).

^See Section IV.

Noncognitive factors have even greater significance in the light
of recent evidence showing that the correlation is low between
cognitive achievement (measured by grades and standardized tests) and
later life success. Cohen (1970), Gintis (1971), and Holtzman (1971)
cite evidence indicating that achievement in terms of job, social
class, and general life expectations is apparently only incidentally
related to school achievement. It is true that a high correlation
exists between amount of education and amount of income, but there
is some evidence that the relationship is based on arbitrary norms
unrelated to the content of education (Berg, 1970). Moreover, Gintis
promotes the thesis that noncognitive factors have a strong influence
on worker earnings and productivity. He reviews evidence in support
of this thesis, and shows that important dimensions of noncognitive
achievement are not promoted or rewarded in most conventional schools.
Schools need to include noncognitive factors in their education
objectives, and better methods for their evaluation need to be
developed.

Despite the obvious importance of noncognitive outcomes,
relatively little research is directed toward discovering their
determinants. Educational effectiveness research is directed almost
entirely toward explaining cognitive achievement, as measured by
standardized achievement tests.^ Most of this section is devoted to a
discussion of the problems associated with using such tests to measure
educational outcomes. Before getting into that discussion, however, we
briefly consider the problems associated with two alternative measures
of cognitive achievement: teacher grades and essay examinations.

^In Section VII of this report we discuss and review much of the
"reform" literature in education. It should not be surprising that most
of these authors consider high-level cognitive and noncognitive factors
to be the more important indicators of student learning, and their
conclusions are rarely based on the results of standardized tests.
However, reliable measures of these factors do not exist, and
conclusions are mostly argumentative and based on personal experience.

TEACHER GRADES AND ESSAY EXAMINATIONS 

Teacher grades of students' performance are extremely unreliable;
they do not correlate with standardized test scores, and teachers do
not correlate with each other in grades assigned to the same student
(Cronbach, 1970). Teacher grades are greatly influenced by student
characteristics not associated with cognitive performance (docility,
social class, and so on), and criteria vary from teacher to teacher.
Grades are further influenced by school policy factors such as "grading
on the curve," or community pressure from parents who do not like to
see their children fail. The technical problems associated with grades
as a subjective rating system are complex, but they need not be
discussed here. Grades have played almost no part in the research on
evaluation of educational outcome.

Essay examinations are widely used in education, sometimes
because objective tests cannot be designed to measure some criteria of
learning. Although essay examinations are widely used, and in spite of
their advantage in being able to measure broad kinds of cognitive
ability, the tests are generally not reliable. Answers to essay
questions vary in several dimensions: vocabulary, style, thought,
originality, neatness, and others. Thus a single score is a complex
weighted sum of the scores on each dimension. Moreover, since
subscores are rarely worked out by the grader, the relative weights
vary between graders, for the same grader over time, and depending on
the situation. In reviewing the research on essay examinations, Coffman
(1971) points out that much research is still needed in the
development of rules for writing and scoring essay questions. None of
the research reviewed in this report uses essay scores as a measure of
educational outcomes.
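The weighted-sum view of an essay score can be made concrete with a
small sketch. The dimension scores and the two sets of grader weights
below are invented for illustration only; they are assumptions, not
data from any study reviewed here. The sketch shows only that the same
essay can receive noticeably different totals when graders weight the
dimensions differently.

```python
# Hypothetical dimension scores for one essay on a 0-10 scale (made-up values).
essay = {"vocabulary": 8, "style": 6, "thought": 9, "originality": 4, "neatness": 7}


def weighted_score(scores, weights):
    """Return the weighted average of dimension scores.

    Weights are normalized so that they sum to 1, which keeps the
    result on the same 0-10 scale as the dimension scores.
    """
    total_weight = sum(weights.values())
    return sum(scores[dim] * w for dim, w in weights.items()) / total_weight


# Two hypothetical graders who implicitly stress different dimensions.
grader_a = {"vocabulary": 1, "style": 1, "thought": 4, "originality": 1, "neatness": 1}
grader_b = {"vocabulary": 1, "style": 1, "thought": 1, "originality": 4, "neatness": 1}

score_a = weighted_score(essay, grader_a)  # stresses "thought"
score_b = weighted_score(essay, grader_b)  # stresses "originality"
print(score_a, score_b)
```

With these invented numbers the first grader's total is 7.625 and the
second's is 5.75, although neither grader disagrees about any single
dimension of the essay.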
STANDARDIZED ACHIEVEMENT TESTS -- CONCEPTUAL PROBLEMS



Incentives 

One very serious criticism of standardized tests is that they
engender perverse incentives by overemphasizing some outcomes at the
expense of others. As a result of the increasing interest in
accountability, student achievement is being measured more and more by
standardized tests,^ with test scores based on national norms. Although
this practice allows a school to assess itself relative to other
schools, these tests introduce a number of liabilities and hazards.
Foremost among these is the danger of suppressing desirable outcomes
that are not measured by standardized tests (abstract reasoning,
creativity, and so on).

Further, although it certainly is necessary and important for
children to acquire basic reading and math skills, focusing on teaching
these skills may be less important than is often believed. It is
generally assumed that achievement in basic math and reading skills
as measured by standardized tests is correlated with, and perhaps
responsible for, achievement in other subject matters and cognitive
areas. However, the generalization^2 of improvement in basic reading
and math skills through special programs has not been demonstrated,
although in view of the rather temporary nature of many of the gains
obtained in these programs, the lack of generalization is not
surprising. Undoubtedly, these skills do generalize under some
conditions, but the conditions are not known. This point is discussed
in greater detail below.^

^The most widely used standardized tests measure achievement in
subject matter areas, although there are also many tests for math and
reading readiness, concept attainment, psycholinguistic performance,
and other general and specific ability tests. In the elementary
grades, the current programs of performance contracting and
accountability have focused almost entirely on measuring these skills.

^2 Generalization is the spreading of acquired skills to areas in
which the student has had no specific practice. For example,
generalization (or transfer) occurs when an improvement in basic
reading skills leads to (1) an improvement in concurrent school
achievement, such as proficiency in social studies or science; and
(2) an improvement in future school achievement, including reading.



Derivation of Normative Scores

Assuming test items actually measure the amount of learning that
has taken place in a course of instruction, normative scores are
necessary to determine what a "raw" test score^ means (cumulated over
all items). For example, how much "better" is a raw score of 70 than
one of 60 (that is, how much more about the course does the student
know)? How high a score should be "expected"? These questions and
others are answered by deriving a normative score from actual test
scores.

Essentially, the normative score indicates a student's position
in a distribution of scores. To determine the reference distribution,
a sample from a specified population is selected and given the test
(for example, 4th grade children in California). A given individual's
raw score can then be represented as higher than x percent of the
sample scores or as being at the xth percentile. If the sample
distribution is "close" to the population distribution, the percentile
score represents the student's position in relation to the general
reference population. Percentile scores can be transformed into
grade equivalent or other types of normative scores.
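As a minimal sketch of the percentile computation just described, the
following takes a raw score and a norming sample and reports the
percent of sample scores that the raw score exceeds. The ten sample
scores are invented for illustration, and the function name is ours;
published tests use far larger samples and often smooth or interpolate
the distribution, which this sketch does not attempt.

```python
def percentile_rank(raw_score, norm_sample):
    """Percent of norm-sample scores strictly below raw_score."""
    below = sum(1 for s in norm_sample if s < raw_score)
    return 100.0 * below / len(norm_sample)


# Hypothetical raw scores from a small norming sample (made-up values).
norm_sample = [42, 48, 51, 55, 58, 60, 63, 67, 70, 74]

rank_60 = percentile_rank(60, norm_sample)
rank_70 = percentile_rank(70, norm_sample)
print(rank_60, rank_70)
```

In this invented sample a raw score of 60 falls at the 50th percentile
and a raw score of 70 at the 80th, so the meaning of a ten-point
raw-score difference depends entirely on how the reference sample
happens to be distributed.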

^See Section IV.

^A raw score is a measure of the actual number of correct responses.
The score may be a simple frequency count or it may be the sum of test
points, with each test item given some arbitrary assignment of possible
points.

^Age equivalents are most often used with mental abilities tests,
and they report a "mental" age score. The score represents age level
relative to mean performance on a regression line.

Although grade and age^ equivalent scores are widely used, they
have been severely criticized (Cronbach, 1970; Angoff, 1971).
Equivalent scores are obtained by administering a test to samples of
children over the range of desired grades (or ages). The average for a
grade (or 50th percentile score) determines the grade level score. For
example, if a 4th grade student scores at the 50th percentile on a
3rd grade test, then his grade equivalent score is 3. Normally a 4th
grade student would take a 4th grade test, and his percentile score
(say 30) on that test would be converted to a grade level score
directly. A line is then plotted between the mean scores obtained
by each grade across all grades. This regression of score on grade
is used to determine a child's grade equivalent score by the simple
procedure of noting where his score falls on the regression line. If
the regression of grade on score (rather than score on grade) had been
used, a different regression line would have resulted, and scores
would have different grade equivalents (Coleman and Karweit, 1970).
This basic ambiguity is further beclouded by the fact that the
interpretation of the equivalent score depends upon the variation of
scores about the mean for each grade in the original sample (that is,
the variation about the regression line). A child who is two grades
advanced on a test of high reliability (low variability about the
regression line) is also high in his percentile rank (say 95). But,
if the test were of low reliability (high variability), the same
two-year advanced status would be associated with a much smaller
percentile rank (say 70). Further, a 6th grader with a 9th grade
equivalent score does not possess the skills of a 9th grader, nor is
he psychologically the same. Cronbach (1970, p. 98) comments on
equivalent scores:

In the writer's opinion, grade conversions should never
be used in reporting on a pupil or a class, or in research.
Standard scores or percentiles or raw scores serve better.

Age conversions are also likely to be misinterpreted. A
6-year-old with mental age 9 cannot pass the tests a 12-
year-old with mental age 9 passes; the two simply passed
about the same fraction of the test tasks. On the whole,
however, age equivalents cause less trouble than grade
equivalents, if only because the former are not used for
policy decisions in education.

These comments represent only the highlights of the problems
inherent in equivalent scores. For a detailed treatise, the reader
is referred to Angoff (1971).
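The ambiguity between the two regression lines can be illustrated
with a short sketch. The grade and score pairs below are invented, and
ordinary least squares is assumed as the fitting method; this is an
illustration of the general point, not a reproduction of any test
publisher's procedure. The same raw score is converted to a grade
equivalent once by inverting the regression of score on grade and once
by reading directly from the regression of grade on score.

```python
def ols(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical norming data: two children per grade (made-up values).
grades = [3, 3, 4, 4, 5, 5, 6, 6]
scores = [20, 30, 28, 42, 38, 52, 50, 60]
raw = 55  # the raw score to be converted

# Regression of score on grade, inverted to find the grade for this score.
slope_sg, intercept_sg = ols(grades, scores)
ge_score_on_grade = (raw - intercept_sg) / slope_sg

# Regression of grade on score, read off directly.
slope_gs, intercept_gs = ols(scores, grades)
ge_grade_on_score = intercept_gs + slope_gs * raw

print(ge_score_on_grade, ge_grade_on_score)
```

For these invented data the first method gives a grade equivalent of
6.0 and the second about 5.66. The two lines coincide only when score
and grade are perfectly correlated, so the gap widens as the scatter of
scores within each grade grows.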
An important issue in deriving standardized scores concerns the
choice of the normative population. A test with national norms is one
that is supposedly based on a sample representing the normative
population across the nation. To be accurate, the sample population
must be stratified in the same proportions as the overall population;
that is, Negroes and Caucasians, poor and rich, and so on must appear
in the sample in the same proportion that they appear in the general
population. This means that any nationally normed test primarily
reflects the characteristics of white, middle-class America, simply
because there are so many of them.
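Proportional stratification can be sketched in a few lines. The group
labels and population counts below are placeholders chosen for
illustration, not census figures, and the function name is ours. Each
stratum receives a share of the norming sample equal to its share of
the population, which is why the largest group dominates the resulting
norms.

```python
def proportional_allocation(population_counts, sample_size):
    """Allocate a sample across strata in proportion to population size.

    Rounding can make the allocations sum to slightly more or less than
    sample_size in general; the example values below divide evenly.
    """
    total = sum(population_counts.values())
    return {
        group: round(sample_size * count / total)
        for group, count in population_counts.items()
    }


# Hypothetical population counts for three strata (placeholder values).
population = {"group_a": 7000, "group_b": 2000, "group_c": 1000}

allocation = proportional_allocation(population, 500)
print(allocation)
```

Here the largest stratum supplies 350 of the 500 sampled children, so
its score distribution largely determines every percentile in the
norms.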

Cultural bias arises when a test is normed on one population and
used to test people from another population. The resulting bias can
be subtle and may lead to gross misinterpretations of data. For
example, a nationally normed test of concept ability might be given
to children from a Mexican-American ghetto. If the test uses written
test items and instructions, the children's scores are affected by
their ability to understand the language, and if they have language
problems, their concept ability scores will be poor. Their "true"
concept ability remains untested. Attempts to develop tests that
are free from language ability have not been very successful; even
"nonverbal" tests are frequently found to correlate with language
ability.

A more subtle influence of the normative population occurs through
the operation of the values of that population. Standardized tests
necessarily (because of their method of construction) reflect what the
normative population feels is important. Without great exaggeration,
one may state that these tests indicate how well students have achieved
white, middle-class goals. Later in this section we quote a comment by
Jensen that illustrates this point in reference to intelligence tests.
The problem of cultural bias in testing and emerging social issues is
discussed by Holtzman (1971), who states (p. 551):

The emergence of black culture, the Chicano movement,
and the stirring of the American Indian as well as other
forgotten groups in the wake of desegregation and civil
rights legislation have forced white America to re-examine
its soul. The result in the field of mental measurement
has been a recognition and acceptance of cultural varia-
bility, a search for new kinds of cognitive, perceptual, and
affective measures by which to gauge mental development, and
a renewed determination to contribute significantly to the
task of overcoming educational and intellectual deprivation.

In general, tests designed for normative use lend themselves to gross 
misinterpretation of the abilities of those who are culturally different 
from the majority. 



STANDARDIZED ACHIEVEMENT TESTS — OPERATIONAL PROBLEMS 



In addition to the conceptual difficulties discussed above, there
are a number of operational problems encountered in the use of stan-
dardized tests. The UCLA Center for the Study of Evaluation reviewed
over 1,500 standardized tests used in elementary schools (Hoepfner, 1970). 
Results indicate the tests by and large are unsatisfactory. Klein 
(1971) has written a strong criticism of standardized tests and their 
misuse: 



So far, the discussion has painted a pretty bleak
picture regarding the utility of standardized tests for
accountability. The major problems involve questionable
test validity, poor overlap between program and test objec-
tives, inappropriate test instructions and directions, and
confusing test designs and formats. In short, a VOID
exists between the demands of accountability and the
present stock of standardized instruments. Further,
this void will probably only widen as the pressure for
accountability increases unless we start improving the
methods of test construction and use. [Author's emphasis.]

Klein's comments are applied to accountability, but they are
also true for educational research based on standardized achievement
tests in general. The first step in research is accurate measurement;
and, in this respect, achievement tests are too often misused or mis-
interpreted. As Anastasi (1967), among others, has pointed out,
improvements are needed more in the interpretation of scores and
orientation of users than in the actual construction of test instru-
ments. A number of the technical problems in using these tests will
be discussed.

Educational Objectives and Test Content

The apparent failure of many innovative educational programs is
often attributed to the fact that standardized tests used to evaluate
the programs do not measure outcome in terms of some or all program
objectives (for example, Cohen, 1970; Klein, 1971; Lennon, 1971).

Part of the problem is that objectives are rarely stated with suffi- 
cient clarity; but even overlooking this liability, the match between 
program and test objectives is often poor. In the first place, as 
Klein (1971) points out, valid tests covering all of the objectives 
a school might like to attain do not exist. 

Second, tests may cover some program objectives, but there is
usually poor agreement between the specific objectives and the test
content. For example, a test may measure reading ability in terms of,
say, eight areas. A specific program might be aimed at only six
objectives, with no interest in the other two. Most tests, however,
report only a single score averaged across all areas, and this score
indicates achievement on all eight objectives. So a score would be
a combination of how well a student achieved on the six reading program
objectives, plus how well he achieved on the other two. This makes
it impossible to evaluate the program. Tests are not designed with
specific programs in mind, and poor overlap is to be expected between
the objectives a test measures and those an education program aspires
to. Another complication occurs when the test does not represent test
objectives equally. Some of these problems would be clarified if the
tests reported separate scores for each area or objective.
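
The eight-area example can be made concrete with a small sketch. The area scores below are invented for illustration (they are not data from any test reviewed here), and the function names are hypothetical. The sketch shows two students with the same composite score whose performance on the six program objectives differs sharply.

```python
# Hypothetical area scores (percent correct) on an eight-area reading test.
# Areas 1-6 are the program's objectives; areas 7-8 lie outside the program.
PROGRAM_AREAS = [1, 2, 3, 4, 5, 6]


def composite(scores):
    """Single score averaged across all areas, as most tests report."""
    return sum(scores.values()) / len(scores)


def program_score(scores, areas=PROGRAM_AREAS):
    """Average over only the areas the program was aimed at."""
    return sum(scores[a] for a in areas) / len(areas)


# Student A did well on the six program areas and poorly on the other two.
student_a = {1: 80, 2: 80, 3: 80, 4: 80, 5: 80, 6: 80, 7: 40, 8: 40}
# Student B did worse on the program areas but well on the other two.
student_b = {1: 65, 2: 65, 3: 65, 4: 65, 5: 65, 6: 65, 7: 85, 8: 85}

for name, scores in [("A", student_a), ("B", student_b)]:
    print(name, composite(scores), program_score(scores))
# A 70.0 80.0
# B 70.0 65.0
```

Only the separate area scores reveal the difference; the composite alone would suggest the program served both students equally.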



Test Validity 

Test validity generally means: Does the test measure what it is
supposed to measure? It is formally determined by a number of tech-
niques.^ One, a complex process called construct validity, essentially
determines how highly tests supposedly measuring the same thing cor-
relate with each other. Low correlation indicates that one or all
of the tests do not measure the construct being considered with
validity. A second kind of validity is called predictive, and in this
procedure the test is correlated with an external criterion. For
example, a test of reading readiness might be validated by using
success in a reading course as a criterion. The assumption is that
better readiness leads to better achievement. In practice, both kinds
of measures are necessary for test validity. A third type, sometimes
referred to as face validity, simply asks if the items in the test
appear to measure what the test is designed to measure. Although this
latter method lacks the sophistication of the first two, many stand-
ardized tests fail even on this measure. Klein (1971) points out
several examples in which it is obvious that the test items have
little to do with what the test purports to measure. There are, in
fact, many tests that are purposely designed without consideration of
face validity, although they are not widely used in education. Finally,
a test is said to have content validity if it measures something that
some authority asserts that it measures. Much of the foregoing dis-
cussion on the relationship of objectives to test content relates to
content validity. The four measures of validity are all methods for
determining the same thing, and generally several methods are used in
determining the authenticity of a given test.

^For a detailed discussion, see Cronbach (1970).

As previously mentioned, tests often do not adequately overlap 
program objectives, and generally they are not valid even when they 
do appear to overlap. In a book on the theory and design of test 
items, Bormuth (1970) criticizes current methods of test construction 
on the grounds that the item generation techniques lead to tests of 
low validity. An item represents the test writer's response to in-
structional material, and the student's score is thus a function of
the test writer and has no known relationship to instructional content.

Statistical Problems 

Inadequacies in the use of achievement test scores in educational
research are partly attributable to the frequent use of faulty
statistical analyses. By far the majority of studies on compensatory
programs report data on achievement gain^ over some period, and per-
formance contract agreements are almost exclusively written in terms
of achievement gain scores. Gain scores are extremely biased estimates
of true gain (for example, see Harris, 1963). An article by Cronbach
and Furby (1970) offers some refinements on techniques for estimating
true score; however, the important message is that the authors see
no advantage to using gain scores in the first place. Status scores
(scores at any point in time) contain all the information given in
change scores, at least for the situations in which change scores have
traditionally been used. For example, if it is necessary to evaluate
the improvement produced by an innovative program, this is best accomp-
lished using a control group. In both treatment and control groups,
only the final status or achievement score need be used. Pre-test
scores can be involved in the statistical analyses, but not in computing
gains. The groups are now compared with respect to each other. In
many instances, it is unnecessary actually to use an experimental
control group; instead it is possible to use the past history of the
system as a benchmark.
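
One well-known way in which raw gain scores mislead is regression toward the mean, and a small simulation can illustrate it. The sketch below is an illustration under stated assumptions, not a reconstruction of the analyses in Harris (1963) or Cronbach and Furby (1970): every number is invented, each observed score is assumed to be a fixed "true" score plus independent test error, and no student's true score changes between the two testings.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

N = 2000
TRUE_SD, ERROR_SD = 10.0, 5.0  # hypothetical spread of ability and of test error

students = []
for _ in range(N):
    true_score = random.gauss(50.0, TRUE_SD)         # unchanged between tests
    pre = true_score + random.gauss(0.0, ERROR_SD)   # observed pre-test
    post = true_score + random.gauss(0.0, ERROR_SD)  # observed post-test
    students.append((pre, post))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Rank students by pre-test score and take the bottom and top quarters.
students.sort(key=lambda s: s[0])
quarter = N // 4
low_group, high_group = students[:quarter], students[-quarter:]

gain_low = mean(post - pre for pre, post in low_group)
gain_high = mean(post - pre for pre, post in high_group)
gain_all = mean(post - pre for pre, post in students)

print(f"mean gain, lowest pre-test quarter:  {gain_low:+.2f}")
print(f"mean gain, highest pre-test quarter: {gain_high:+.2f}")
print(f"mean gain, all students:             {gain_all:+.2f}")
```

Although nothing was learned or lost in this simulation, students selected for low pre-test scores show an apparent gain and those selected for high scores an apparent loss. A program that enrolls low scorers would therefore appear effective on gain scores alone, which is one reason a comparison of final status against a control group is the safer design.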

Although problems of statistical sophistication and reliability
are important, the crucial problems in achievement evaluation are not
primarily statistical. We agree with Klein (1971) and others that
there needs to be a rather complete overhaul of testing procedures
and interpretation. The shortcomings of standardized tests must be
accounted for in evaluating education. Efforts to eliminate these
inadequacies for future evaluation work will require substantial re-
search.



^A student's test performance is determined by many factors other
than his "true" knowledge or ability. Because these other factors vary
over time, a person's test score will also vary, so that any given test
score is an estimate of the true state of his knowledge or ability.
The achieved test score may be a percentile, an age equivalent, or a
simple sum of correct items. A gain score is obtained by subtracting
a student's score on a test from his score on the same test taken at
a later time.

Criterion Referenced Tests



Standardized (normative) tests are sometimes criticized because
their scores do not indicate the specific skills a student masters.
They only place him relative to other students, and not relative to
instructional content. For example, two students scoring at the
fiftieth percentile on a reading test could have answered different
questions correctly and have acquired different reading skills. This
is true even if the test gives percentile scores for a number of sub-
skills; they are still normative scores. This problem is being
attacked through the design of so-called criterion referenced tests
(Cronbach, 1970; Glaser and Nitko, 1970). Each item on a criterion
referenced test is designed to measure or indicate the accomplishment
of a particular skill. The number of items passed is not the important
factor, but rather which items are passed. The student is not allowed
to proceed to advanced instruction until he acquires prerequisite
knowledge.
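
The contrast between "how many items" and "which items" can be sketched in a few lines. The item names and the skills attached to them are hypothetical, chosen only to illustrate the point; they do not describe any actual test.

```python
# Each hypothetical item is tied to one reading skill.
ITEM_SKILLS = {
    "item1": "letter recognition",
    "item2": "phonics",
    "item3": "vocabulary",
    "item4": "literal comprehension",
    "item5": "inference",
    "item6": "main idea",
}


def total_score(passed):
    """Normative-style summary: how many items were passed."""
    return len(passed)


def skills_mastered(passed):
    """Criterion-referenced summary: which skills the passed items indicate."""
    return {ITEM_SKILLS[item] for item in passed}


student_x = {"item1", "item2", "item3"}
student_y = {"item1", "item4", "item6"}

print(total_score(student_x), total_score(student_y))  # 3 3
print(sorted(skills_mastered(student_x)))
print(sorted(skills_mastered(student_y)))
```

Both students receive the same total, and so the same relative standing, yet they share only one mastered skill. The criterion-referenced summary keeps the information the total discards.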

A key feature of criterion referenced tests is their relation-
ship to the specific goals and subject matter of a course. Test
items are designed to indicate success on the learning tasks neces-
sary to cover the subject matter and to meet the course objectives.
This requires a detailed task analysis of course material. Few general
procedures for this task analysis have been developed, although Gagne's
work on hierarchical organization (1962) shows promise. Section IV
discusses research on the organization of instructional material, and
there we point out that skills and knowledge required for a course can
be arranged in a hierarchy, such that success at a higher level depends
upon acquisition of skills at a lower level.
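
A hierarchy of this kind can be represented as a table of prerequisites. The sketch below is a minimal illustration, assuming an invented arithmetic hierarchy; the task names and the `ready_for` helper are hypothetical and are not taken from Gagne's work.

```python
# Hypothetical hierarchy: each task lists the lower-level tasks it depends on.
PREREQUISITES = {
    "count objects": [],
    "add single digits": ["count objects"],
    "add with carrying": ["add single digits"],
    "multiply single digits": ["add single digits"],
    "long multiplication": ["add with carrying", "multiply single digits"],
}


def ready_for(mastered):
    """Tasks not yet mastered whose prerequisites have all been mastered."""
    return sorted(
        task
        for task, needs in PREREQUISITES.items()
        if task not in mastered and all(n in mastered for n in needs)
    )


mastered = {"count objects", "add single digits"}
print(ready_for(mastered))
# ['add with carrying', 'multiply single digits']
```

Given the tasks a criterion referenced test shows a student has passed, a table like this indicates what he is ready to undertake next, which is the diagnostic use described in the text.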

The distinction between normative and criterion referenced tests
is made primarily on the basis of the purpose for which the test was
constructed and how information obtained from it is used. The purpose
of a criterion referenced test is to indicate a student's status on
a set of specific tasks necessary for the completion of a course of
instruction. The test information not only assesses his accomplish-
ments but is also used to determine what tasks the student is ready
to undertake. Normative referenced tests indicate a student's rela-
tive position in a population, and the information from these tests is
used to evaluate achievement relative to other students, in terms of
overall achievement. The use of criterion referenced tests for this
purpose is not clear since such tests indicate which instructional
tasks the student has accomplished; essentially, he passes or he does
not for each task. The number of tasks he "passes" cannot be meaning-
fully added for a total test score. Criterion referenced tests serve
diagnostic functions in evaluation, which aims at special information
for student remediation or course improvement.

Much work remains to be done in developing criterion referenced 
tests but they appear to have great promise. Their greatest potential 
value is that they focus on instructional content, yield information
for remediation, and allow for individual differences in performance. 

GENERAL INTELLIGENCE TESTS 

General intelligence tests are standardized achievement tests.
They have been developed over a longer period than most standardized
achievement tests, and more research has been directed toward their
improvement: They are more valid when properly used; they usually
report subscores on various test objectives; and directions for
administration are generally better. Sometimes changes in IQ scores
are used to measure student achievement, and many attempts have been
made to improve IQ scores through compensatory school and preschool
programs. Failure to find consistent evidence that IQ can be modified
(for example, Butler, 1970) led Kohlberg (1968), among others, to argue
that IQ is not a good measure of the efficacy of these programs. For
years, psychologists have stated that many IQ tests are mostly achieve-
ment tests. They measure what the person has learned, not primarily
his capacity for learning. The scores reflect environmental influences
and past learning as well as innate ability. The belief that IQ can
be affected by environment has been confirmed many times in studies
of identical twins, but many factors contribute to this effect other
than those present in the school environment (Vandenberg, 1966).
On the other hand, Jensen (1969) reports evidence that IQ is largely
determined by genetics, and can only be modified by environment in a
relatively small degree.

The various uses of IQ tests in recent education programs have
caused a re-emergence of debate and inquiry into the validity and
meaning of general intelligence test scores. The appropriateness of
using them (or any achievement test) depends on the goals and
objectives the test is being used to evaluate. Judging that fit
is never an easy task and is made even more difficult by the
interaction of social values and subtle and nonverbalized goals that
exert profound influence on test content, scores, and interpretation.
This has been well stated by Jensen (1970):

It should not be forgotten that intelligence tests as we 
know them evolved in close conjunction with the educational
curricula and instructional methods of Europe and North
America. Schooling was not simply invented in a single
stroke. It has a long evolutionary history and still 
heavily bears the imprint of its origins in predominantly 
aristocratic and upper-class European society. Not only 
did the content of education help to shape this society, 
but, even more, the nature of the society shaped the con- 
tent of education and the methods of instruction for im- 
parting it. If the educational needs and goals of this 
upper segment of society had been different, and if their 
modal pattern of abilities — both innate abilities and 
those acquired in these peculiar environmental circum- 
stances — were different, it seems a safe conjecture 
that the evaluation of educational content and practices 
and consequently the character of public education in 
modern times would be quite different from what it is.

And our intelligence tests — assuming we have them under 
these different conditions — would most likely also have 
taken on a different character.



SUMMARY 

Using standardized tests to evaluate student achievement has
become a major enterprise in the schools; but in spite of the wide
use of and reliance on these tests, they are generally inadequate. This
is alarming in light of the growing activity in evaluation of educational
outcome based on standardized test scores. Standardized tests, even
when properly used and interpreted, evaluate only a limited number
of educational objectives. At best, generally used tests measure
only limited aspects of cognitive performance, while higher cogni-
tive abilities and achievements go untested. Noncognitive achieve-
ment is sometimes talked about, but the evaluation of these factors
is still in a very crude state. Inasmuch as schools and innovative
education programs are being evaluated in terms of such limitations,
there is a crucial need for immediate improvements in test design,
concept, scoring interpretation, and administration.

III. THE INPUT-OUTPUT APPROACH



In this section we review the results of a number of studies of
educational effectiveness in what we have called the input-output
approach. These studies are distinguished by a view of the educational
process that holds a student's educational outcome is determined by the
quantities of resources his school makes available to him; by the per-
sonal, family, and community characteristics that influence his learning
(typically grouped under the term "background factors"); and by the in-
fluences of his peers. In this approach the school in which the student
is enrolled affects his outcome only to the extent that it serves as the
channel through which resources flow to him. In particular, the struc-
ture and organization of the school and classroom are neglected.

The educational "production function" is a formal representation
of the relationship between school resources and background factors on
one hand, and student outcomes on the other. It is commonly expressed
in the form of a mathematical relation or equation:

(1)   $O = g(r_1, \ldots, r_n, f_1, \ldots, f_m, p_1, \ldots, p_k)$

where there are assumed to be n relevant school resources, m relevant
background factors, k relevant peer group influences,^ and:

$O$ = a student's output, for example, his score on a
standardized achievement test;

$r_1, \ldots, r_n$ = the amounts of school resources 1 through n, respectively,
that he received; for example, resource 1 might be the
ability of his teacher, resource 2 the size of his class,
and so on;

$f_1, \ldots, f_m$ = the amounts of background factors 1 through m, respectively,
that the student has been exposed to; for example, $f_1$
might denote his family's income, $f_2$ his father's occupa-
tion, and so on;

$p_1, \ldots, p_k$ = the amounts of peer group influences 1 through k,
respectively, that the student has been exposed to;
for example, $p_1$ might denote the proportion of his class-
mates that intend to go to college, $p_2$ the proportion of
his classmates that are members of minority groups, and
so on.

^Some researchers prefer the term "student body effects."

The educational production function is expressed in its most general
form in Eq. (1), which merely states that for any particular student,
described in terms of his background factors, the amounts of school
resources he receives and the influences of his peers determine his
outcome. In order to make a quantitative estimation of the impact of
any particular resource upon outcomes, the precise relationship between
inputs (resources, factors, and peer group influences) and outcomes
must be specified. Conceptually, any one of an infinitely large set of
possible relationships can be specified. In practice, however, only
one functional form, the linear one, has thus far been employed in
educational production-function studies. But this is more a reflection
of the limitations of current statistical techniques than the result of
any consensus about the underlying nature of the educational process.

The linear production function assumes that each unit of a particular
school resource or background factor or peer group influence contributes
a constant amount to student outcome. The unit contribution of any one
input does not vary with the amount of that input the student receives,
nor with the amounts of any of the other inputs the student receives.
More formally, this specification of the production function can be
expressed as in Eq. (2):

(2)   $O = a + b_1 r_1 + \cdots + b_n r_n + c_1 f_1 + \cdots + c_m f_m + d_1 p_1 + \cdots + d_k p_k$

As before, $O$ denotes the student's outcome, $r_i$ denotes the amount of
the ith school resource the student received ($i = 1, \ldots, n$), $f_i$ denotes
the amount of the ith background factor ($i = 1, \ldots, m$), and $p_i$ denotes
the amount of the ith peer group influence ($i = 1, \ldots, k$); $b_i$ is the
unit contribution of the ith school resource, $c_i$ the unit contribution
of the ith background factor, and $d_i$ the unit contribution of the ith
peer group influence.

Equation (2) can be interpreted as follows. Suppose a student
were to receive $r_1$ units of the first school resource. If each of these
units contributes $b_1$ to his outcome, independent of the quantities of
any other inputs he receives, the total contribution of the first school
resource to his outcome is $b_1$ times $r_1$. An identical argument would
show that the total contribution to outcome of any other school resource,
say the ith resource, is $b_i$ times $r_i$. Similarly, if the student is ex-
posed to $f_i$ ($p_i$) units of the ith background factor (peer group influence),
the total contribution of that factor (influence) to his outcome will be
$c_i$ times $f_i$ ($d_i$ times $p_i$). Since the contributions are independent of
one another, and every input that influences a student's outcome is
presumably included in Eq. (2), we need simply add them together to
determine a student's outcome. (The first term on the right-hand side
of the equation, a, is a normalizing constant that need not concern us
here.) For example, Kiesling (1969) has fitted the following equation:

$O = 2.26 - .012\, r_1 - .0065\, r_2 + .0013\, r_3 - .00065\, r_4 + .0017\, r_5 + .127\, f_1$

where

$O$ = Composite score on Iowa Test of Basic Skills for an urban
school district
$r_1$ = Teachers per pupil
$r_2$ = Expenditure on books and supplies per pupil
$r_3$ = Teacher salary
$r_4$ = Value of school-owned property per pupil
$r_5$ = Expenditure on principals and supervisors per pupil
$f_1$ = Index of occupation of adults in district.
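
A short sketch may help show what the linear form implies. It evaluates the fitted equation quoted above for made-up input levels. The coefficient values are as read from the text, with the sign of the teacher-salary term assumed positive because it is not fully legible in the source; the dictionary keys, the input levels, and their units are all hypothetical, since the source gives none.

```python
# Coefficients as read from the fitted equation quoted in the text. The sign
# on the teacher-salary term is assumed positive (not fully legible in print).
INTERCEPT = 2.26
COEFFICIENTS = {
    "teachers_per_pupil": -0.012,
    "books_supplies_per_pupil": -0.0065,
    "teacher_salary": 0.0013,
    "property_value_per_pupil": -0.00065,
    "principals_supervisors_per_pupil": 0.0017,
    "occupation_index": 0.127,
}


def predicted_outcome(inputs):
    """Linear production function: constant plus the sum of coefficient * amount."""
    return INTERCEPT + sum(COEFFICIENTS[name] * amount for name, amount in inputs.items())


# Purely illustrative input levels; units and magnitudes are assumptions.
baseline = {
    "teachers_per_pupil": 0.04,
    "books_supplies_per_pupil": 20.0,
    "teacher_salary": 900.0,
    "property_value_per_pupil": 500.0,
    "principals_supervisors_per_pupil": 30.0,
    "occupation_index": 10.0,
}
# A second district that differs in every school resource.
other = dict(baseline, teacher_salary=1200.0, books_supplies_per_pupil=5.0)


def effect_of_one_more_unit(inputs, name):
    """Change in predicted outcome from one additional unit of a single input."""
    raised = dict(inputs, **{name: inputs[name] + 1.0})
    return predicted_outcome(raised) - predicted_outcome(inputs)


effect_a = effect_of_one_more_unit(baseline, "occupation_index")
effect_b = effect_of_one_more_unit(other, "occupation_index")
print(round(effect_a, 6), round(effect_b, 6))  # 0.127 0.127
```

Whatever the levels of the other inputs, one more unit of a given input changes the predicted outcome by exactly its coefficient. That is the "constant unit contribution" assumption of the linear form stated in code.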

OBJECTIVES AND METHODS 

The objective of research, in the present case, is to estimate the
numerical values of the b's, c's, and d's that appear in Eq. (2). If we
knew these values, we could predict the impact of providing students
with more or less of any particular school resource. This would allow
us to determine whether increasing (or decreasing) the amount of any one
school resource would affect students' outcomes more or less than in-
creasing (or decreasing) the amount of any other school input. Taking
account of the relative prices of the various school resources, we could
then determine how much of each school resource should be purchased to
attain any particular goal for student outcomes at minimum cost. In
short, we could formulate optimal educational policies. However, the
costs of obtaining school resources have not yet been incorporated in
empirical analyses. Estimates of educational production functions,
the topic to which we devote the remainder of this section, are only
the first step toward an educational policy.
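
To illustrate how prices would enter, the sketch below combines hypothetical unit contributions with hypothetical unit prices. None of the numbers or resource names comes from the studies reviewed here. It relies on one consequence of a strictly linear function: if every coefficient is positive and purchases are unlimited, the least-cost way to reach a target gain is to buy only the resource with the lowest cost per outcome point.

```python
# Hypothetical unit contributions (the b's of Eq. (2)) and unit prices.
RESOURCES = {
    # name: (outcome points per unit, dollars per unit)
    "teacher_experience_year": (0.30, 150.0),
    "library_books_per_pupil": (0.05, 10.0),
    "smaller_class_by_one": (0.40, 400.0),
}


def cost_per_point(b, price):
    """Dollars needed to raise the outcome by one point with this resource."""
    return price / b


def cheapest_plan(target_gain):
    """Least-cost purchase under a strictly linear function with no limits."""
    name, (b, price) = min(
        RESOURCES.items(), key=lambda item: cost_per_point(*item[1])
    )
    units = target_gain / b
    return name, units, units * price


for name, (b, price) in RESOURCES.items():
    print(name, round(cost_per_point(b, price), 2))

name, units, cost = cheapest_plan(2.0)
print(name, round(units, 2), round(cost, 2))
```

With these invented figures the plan buys only library books. Such all-or-nothing answers are an artifact of linearity; any realistic analysis would add limits on each resource or allow unit contributions to diminish.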

Multiple regression analysis is used to estimate the values of the
coefficients (the b's, c's, and d's) in Eq. (2). Details of the tech-
nique can be found in any statistics text. A multiple regression analy-
sis provides for tests of the "significance" of the empirical results.
These are formal measures of the accuracy of the results in the sense
that they indicate how much confidence can be placed in them. In educa-
tional production-function studies the analyst is typically concerned
with identifying resources or factors that affect student outcomes. In
terms of Eq. (2), he is concerned with identifying inputs whose coef-
ficients have nonzero values. To say that the coefficient of a variable
is significant means that the test of significance indicates a small
probability that that particular coefficient is zero. Just how small
is referred to as the significance level.
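
The estimation step can be sketched with a small self-contained routine. The data are simulated under assumptions stated in the comments (one resource with a real effect, one with none, one background factor); the function names are hypothetical, and the routine is ordinary least squares solved through the normal equations, not the procedure of any particular study.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients and t statistics; each row begins with 1.0."""
    n, k = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    resid = [yi - sum(c * x for c, x in zip(coef, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    # Diagonal of the inverse of X'X, one unit vector at a time.
    diag = [solve(xtx, [1.0 if i == j else 0.0 for i in range(k)])[j] for j in range(k)]
    t_stats = [c / math.sqrt(sigma2 * d) for c, d in zip(coef, diag)]
    return coef, t_stats


random.seed(1)
rows, outcomes = [], []
for _ in range(500):
    r1 = random.uniform(0, 10)  # hypothetical resource with a true effect of 0.5
    r2 = random.uniform(0, 10)  # hypothetical resource with no true effect
    f1 = random.uniform(0, 10)  # hypothetical background factor, true effect 2.0
    rows.append([1.0, r1, r2, f1])
    outcomes.append(20.0 + 0.5 * r1 + 0.0 * r2 + 2.0 * f1 + random.gauss(0, 1.0))

coef, t_stats = ols(rows, outcomes)
for label, c, t in zip(["constant", "r1", "r2", "f1"], coef, t_stats):
    print(f"{label:8s} coefficient {c:8.3f}   t {t:8.2f}")
```

In large samples a t statistic beyond roughly 2 in absolute value corresponds to significance at about the 5 percent level. Here the coefficients on r1 and f1 should be estimated close to their true values with very large t statistics, while the estimate for r2 should be close to zero.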

^For a discussion of this issue see Cain and Watts (1968).

^See, for example, Wonnacott and Wonnacott (1970).

The basic assumption underlying all studies in the input-output
approach is that the production function is an equally accurate
description of the educational process for all students, or at least
for some identifiable subgroup of students. In other words, the unit
contribution of any given resource, factor, or influence to student
outcome is assumed to be approximately the same for all, or some sub-
group of, students. This assumption implies that if any particular
resource or factor does have a significant impact on student outcomes,
the coefficient of that resource or factor should be significant in any
study that examines it. Otherwise, every student must be different or
respond differently to the same resources.^

There is always some possibility that a variable that appears to
have a significant impact upon student outcomes may, in fact, be unre-
lated to outcome. It is therefore clear that educational policy cannot
be based on the results of any one study. The basic assumption of 
production-function analysis reinforces this point. We do not emphasize 
the results yielded by any one study. Rather, our primary concern is 
to identify results that consistently appear throughout a number of 
studies. 

VARIABLES 

Educational researchers have, at one time or another, investigated
a large number of student outcomes, school resources, and background
factors. It would be futile to attempt to list them all here. In order
to convey some feeling for the sorts of variables that are investigated
in educational production-function studies, we will describe some that
appear most often. Appendix A contains complete lists of variables for
each of 18 major studies in this approach.

^See Section IV for a discussion of this possibility.

Student outcomes are most often cognitive achievement, measured by
scores on standardized reading or mathematics achievement tests. Drop-
out rates or "holding rates" (the latter is defined as one minus the
dropout rate) are occasionally examined. Less frequently included in
student outcomes is some measure of college attendance or intention to
attend. Recently, researchers have begun to investigate students' attitudes
as outcomes of education; but, by and large, the noncognitive domain
and much of the cognitive domain^ remain unexplored.

School resources virtually always include measures of the "quality"
of the school's faculty. Average teachers' experience, salary, degree
level, and verbal ability are the four most common. Average class size
or student-teacher ratios appear often as well. Measures of the physical
plant or facilities of the school are also generally included in educa-
tional production-function studies. The age of the school buildings and
the number of library books per student are examples.

Background factors include measures of the socioeconomic status of
the students' families or of the communities their school serves. Average 
family income, father's (or mother's) education, and father's (or mother's) 
occupation are typical. The racial composition of the community and 
whether the community is urban or rural are examples of community factors. 

Peer group influences include measures of the educational attain-
ment and aspirations, the attitudes and motivations of a student's class-
mates. The percent of his class that intend to enter college, the
proportion of his class whose families own encyclopedias, and the
attendance and transfer rates of his classmates are typical measures
of a peer group's influence on a student.

ANALYTICAL PROBLEMS 

The educational researchers who have worked in the input-output
approach have been plagued by many severe analytical problems. Before
presenting the results of these studies, we alert the reader to the
limitations of this research approach.

The most serious difficulty faced by the production-function
approach is rooted in the sorts of data used in the empirical analyses.
No production-function studies of educational effectiveness have been
based upon observation of true experiments. Rather, they have relied
upon so-called "natural experiments" for their empirical content.




^See Section IX.



By a natural experiment we mean a situation created by chance or
coincidence, from the researcher's point of view, in which basically
similar individuals have been subjected to different stimuli.^ By
analyzing their responses, the researcher hopes to discover how indi-
viduals in general will respond to the various stimuli. In education,
for example, a natural experiment would occur if students at the same
grade level from identical backgrounds and subject to identical peer-
group influences were to attend different schools and thus receive
different amounts of various school resources. An analysis of this
situation might reveal whether differences in the students' outcomes
were systematically related to differences in the amounts of the re-
sources they received. Another natural experiment would occur if
students from differing backgrounds were to attend the same school at
the same grade level, be subjected to the same peer-group influences,
and receive identical amounts of every school resource. Analysis of
this situation could show the extent to which differences in their out-
comes systematically varied with the differences in their backgrounds.

But students come from a wide variety of backgrounds, attend dif-
ferent schools, and, even within the same school at the same grade level,
may receive substantially different amounts of each school resource.
Thus the researcher is faced with an extremely complex natural experiment.
Subject to important limitations, multiple regression techniques can deal
with such a situation, at least so far as the data generated by this
convoluted experiment are amenable to analysis. But the data often
impose serious limitations on the analysis.

^The situation may have been deliberately caused by individuals or
groups of individuals for their own purposes. The point is that the
situation was not brought about to meet the researcher's needs.

Individual schools tend to serve relatively homogeneous populations.
The students in any one school generally live in the same neighborhood
and are subject to the same community influences. Further, their families
are apt to be similar in terms of social and economic characteristics.
Hence, a student's background is likely to be quite similar to the back-
grounds of his peers. The levels of various school resources also vary
from one school to another. As a result, we observe that students' out-
comes systematically vary with simultaneous variations in school resources,
peer-group influences, and students' backgrounds. Under these circum-
stances we generally cannot separate the part of the variation in outcome
due to variation in school resources from the parts due to variations in
students' backgrounds or peer-group influences.

In most cities, for example, new school buildings are located in 
the urban fringe, serving predominantly middle- and upper-class com- 
munities. The older schools are found in the older sections of the city, 
often in poverty areas. If students in the newer schools have systema- 
tically higher outcomes, by some measure, we could observe that students 
from middle- and upper-class backgrounds who attend the newer schools do 
better than students from poverty backgrounds who attend the older 
schools. But we could not determine whether the former performed better 
because they came from more advantaged backgrounds, because they attended 
schools with new buildings, or because their classmates came from more
advantaged backgrounds. 
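
The identification problem in this example can be sketched in a few lines. The sketch assumes a hypothetical sample in which every student in a new building also comes from an advantaged home and has advantaged classmates, so the three indicators coincide exactly. The function `predict`, the baseline score of 50, and the 10-point gap are illustrative assumptions, not values taken from any study.

```python
# Hypothetical sample: the three indicators coincide for every student,
# as in the new-building example (all values are illustrative).
students = [
    {"new_building": 1, "advantaged_home": 1, "advantaged_peers": 1},
    {"new_building": 1, "advantaged_home": 1, "advantaged_peers": 1},
    {"new_building": 0, "advantaged_home": 0, "advantaged_peers": 0},
    {"new_building": 0, "advantaged_home": 0, "advantaged_peers": 0},
]


def predict(student, building, home, peers, baseline=50.0):
    """Predicted outcome under one assumed split of a 10-point gap."""
    return (baseline
            + building * student["new_building"]
            + home * student["advantaged_home"]
            + peers * student["advantaged_peers"])


# Three rival explanations of the same 10-point gap.
explanations = {
    "buildings only": (10.0, 0.0, 0.0),
    "home background only": (0.0, 10.0, 0.0),
    "peers only": (0.0, 0.0, 10.0),
}

predictions = {
    name: [predict(s, *effects) for s in students]
    for name, effects in explanations.items()
}

# Every explanation yields the same predictions, so data of this kind
# cannot tell the explanations apart.
indistinguishable = len({tuple(p) for p in predictions.values()}) == 1
```

Real samples are rarely this extreme, but the closer the three indicators move together, the less the data can say about which one is responsible.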

A second major problem that confronts researchers using the input- 
output approach stems from data aggregation. The researcher would like 
to examine the relationship among the school resources an individual
student receives, his background, and the influences of his peers on 
one hand and his educational outcome on the other. But data are almost 
never available in such detail.^ The researcher generally has data 
available only in much more aggregated form. For example, a researcher 
might wish to investigate the extent to which a teacher's experience 
affects the outcomes of his students. Ideally the researcher would col-
lect outcome data from students who had different teachers and analyze the
relationship between student outcome and teacher experience.^ If the
data do not permit him to identify the particular teacher each student

^Hanushek (1970) is the only analysis conducted at this level of
detail.

^In such an analysis variables measuring other school resources,
background factors, and peer-group influences would have to be included.
We neglect these variables in order to focus on the main issue.

had, he cannot, of course, conduct the study. What researchers often
do in these circumstances is to collect data from students in a number
of different schools (or even districts) and investigate the relation-
ship between a student's outcome and the average level of teacher exper-
ience in his school (or district).

The problem here is that if a teacher's experience does in fact
affect his students' outcomes, a considerable amount of information is
lost. Within a school there would be considerable variation in students'
outcomes caused by variations in their respective teachers' amounts of
experience. But this variation is averaged out in the aggregate data
and cannot be investigated in an analysis that uses aggregate data.

Roughly 30 percent of the variation in students' outcomes is
variation among schools. Thus, an analysis of individual students'
outcomes that uses school resources or peer-group influence data aggre-
gated to the school level can account for at most about 30 percent of the
variation in students' outcomes. Analyses that use data aggregated
to the district level are even more restricted because the variance in
students' outcomes between districts is smaller yet; even more infor-
mation is "averaged out" of the analysis.
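
The ceiling that aggregation imposes can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reanalysis of any study: outcomes are generated so that about 30 percent of their variance lies between schools, and the number of schools, the number of students per school, and the seed are arbitrary. Any predictor that takes one value per school can do no better than the school's own mean outcome, so the fit of that mean marks the most a school-level analysis could explain.

```python
import random
import statistics


def simulate_students(n_schools=400, n_per_school=25,
                      between_share=0.30, seed=0):
    """Simulate outcomes with an assumed between-school variance share."""
    rng = random.Random(seed)
    outcomes, school_mean_for_student = [], []
    for _ in range(n_schools):
        # One shared component per school, plus individual variation.
        school_effect = rng.gauss(0, between_share ** 0.5)
        scores = [school_effect + rng.gauss(0, (1 - between_share) ** 0.5)
                  for _ in range(n_per_school)]
        school_mean = statistics.fmean(scores)
        outcomes.extend(scores)
        school_mean_for_student.extend([school_mean] * n_per_school)
    return outcomes, school_mean_for_student


def r_squared(actual, predicted):
    """Share of total variance accounted for by the predictions."""
    grand_mean = statistics.fmean(actual)
    total = sum((a - grand_mean) ** 2 for a in actual)
    residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - residual / total


outcomes, school_means = simulate_students()

# Best possible fit for any analysis that only sees school-level values.
best_school_level_fit = r_squared(outcomes, school_means)
```

Under these assumptions `best_school_level_fit` comes out close to the assumed between-school share, however good the school-level measures are.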

RESULTS 

In reviewing educational production-function studies, we surveyed
the literature in a number of different fields. Education, economics,
sociology, and public policy have all included such analyses in their
domain. From this literature we selected a number of studies for care-
ful and detailed examination. Two criteria were used in the selection
process. First, we chose for detailed review only studies that examined
the impact of a school resource, simultaneously taking account of the
impact of other school resources and background factors. Second, we
neglected studies that grossly misused statistical estimation procedures.
The results presented below derive from our examination of the reports
that satisfied these criteria.^

^Appendix A presents a detailed summary of each report reviewed.

Overall

Considering first the overall results of these studies, we generally
find that estimated production functions seldom explain students' out-
comes very well. This finding is based on an examination of what are
commonly termed "goodness-of-fit" statistics. Intuitively, we can view
goodness-of-fit statistics as estimates of how accurately we could pre-
dict a student's outcome using the results of the production-function
analysis. Suppose we knew no more than a student's grade level and
were asked to predict how well he would perform on a standardized
achievement test. The best estimate we could make would be the mean
score achieved on that test by students at that grade level. Now sup-
pose that we had a complete description of the student's background
factors, peer-group influences, and the school resources he has re-
ceived. If we used this information in a production function to esti-
mate his performance on the test, and if our prediction were perfect,
we could say that the function was 100 percent accurate. On the other
hand, if our estimate based on the production function were no more
accurate than the estimate we would make in the absence of that informa-
tion, we would say that the function was 0 percent accurate. In these
terms, production-function studies are rarely better than 15-20 percent
accurate, and are often far less accurate.^ In sum, although the pro-
duction functions estimated thus far are helpful in understanding
student outcomes, the amount of help they offer is relatively small.
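
The intuitive scale running from 0 to 100 percent accurate can be written out as a short sketch. It treats "accuracy" as the share of the squared error of the grade-mean guess that the production-function prediction removes, which is the usual R-squared; the test scores and function names are hypothetical. The second function converts a fit reported on between-school variance into a share of total variance, using the report's figures of 0.50 and 0.30 as inputs.

```python
import statistics


def predictive_accuracy(actual, predicted):
    """Share of the grade-mean guess's squared error that is removed.

    1.0 corresponds to "100 percent accurate", 0.0 to "0 percent accurate".
    """
    mean_score = statistics.fmean(actual)
    baseline_error = sum((a - mean_score) ** 2 for a in actual)
    model_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - model_error / baseline_error


def share_of_total_variance(r2_between_schools, between_school_share):
    """Restate a between-school fit as a share of total variance."""
    return r2_between_schools * between_school_share


scores = [40.0, 50.0, 60.0, 70.0]  # hypothetical test scores

# A perfect prediction, and one no better than the grade mean of 55.
perfect = predictive_accuracy(scores, scores)
no_better_than_mean = predictive_accuracy(scores, [55.0] * len(scores))

# A study explaining half of the between-school variance, when that
# variance is 30 percent of the total.
total_share = share_of_total_variance(0.50, 0.30)
```

On this scale an analysis of aggregate data that reports a fit of 0.50 accounts for 15 percent of the total variance in outcomes.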



Peer-group Influence 

The debate over the importance of a student's peers is illustrative
of the analytical problems encountered in production-function analyses



^Formally, the goodness-of-fit statistic, or "R²" to use the standard
notation, is the percent of total variance in students' outcomes that is
attributed to the variance in the explanatory variables: resources, in-
fluences, and factors. But, as indicated above, the total variance in
students' outcomes between schools is about 30 percent of the total
variance in students' outcomes. Thus, an analysis that uses aggregate
data (as all but one do) may report an R² of, say, 0.50. That means
that 50 percent of the variance in students' outcomes between schools is
"explained" in the analysis. But that is only 15 percent (.50 x .30) of
the total variance in students' outcomes.

of educational effectiveness. To demonstrate the sorts of difficulties
that stem from these problems we trace the debate chronologically.

Student-body effects were not examined in the context of production- 
function research prior to the Coleman Report (1966). That study included 
the following results: 

o A pupil's achievement is strongly related to the
educational backgrounds and aspirations of the other
students in the school (p. 22).

o There is evidence, even in the short run, of an effect
of school integration on the reading and mathematics
achievement of Negro pupils (p. 29).^

These results followed from an analysis showing that, in terms of the
concepts introduced earlier, a production function that included varia-
bles measuring the background of the student body could predict a student's
outcome significantly more accurately than one that did not.

Bowles and Levin (1968) examined the Coleman Report in some detail
and disputed many of its findings. In particular, they questioned the
two results cited above. Coleman did not have an opportunity to observe
the behavior of poor children who attended majority poor schools and then
transferred to majority middle-class schools.^ Instead, he had to rely
upon natural experiments. Specifically, Coleman compared the outcomes 
of poor students who attended majority poor schools with the outcomes 
of poor students who attended majority middle-class schools. His results 
stem from the apparently superior performance of the latter, even after 
controlling for the school resources they received and their backgrounds. 

Bowles and Levin point out that predominantly poor schools tend to 
serve communities that are substantially different from the communities 
served by predominantly middle-class schools. Thus, poor students who 
attend predominantly middle-class schools come from families and live in

^Note that integration is a particular variant of peer-group in- 
fluence insofar as educational effectiveness is concerned. 

^The awkward term "majority poor" is used here to describe schools
where the families of a majority of the students are poor. Related
terms such as "majority middle-class" and "predominantly black" are
similarly defined.

communities that are quite different from the families and communities
of the poor students who go to predominantly poor schools. In short, the
background factors of students in high-aspiration, high-educational-
background schools may cause them to perform better, and not merely the
fact that they are in such schools.

Coleman's finding with respect to integration is also questioned 
by Bowles and Levin. They point out that differences in emphasis exist
in various sections of the Report. "And in fact, Coleman has emphatically 
stressed that the survey revealed no unique effect of racial composition 
on the achievement levels of nonwhites" (Bowles and Levin, 1968, p. 22). 
But we note that on this point Bowles and Levin do not refute Coleman.
Rather, they argue that alternative interpretations of Coleman's empirical 
results are as likely to be valid as Coleman's interpretations. 

Bowles (1969) has conducted a production-function analysis using a 
different body of data — the Project TALENT data file. He has found 
that "a measure of the social class and achievement levels of the school 
...is not significantly related to black achievement" (p. 72). Bowles 
also suggests that apparent student-body effects are very likely to stem 
from the difficulty of identifying the contribution of a student's back-
ground factors to his outcome in complex natural experiments.

Smith (1971) has made a complete re-analysis of the Coleman data.
Like Bowles and Levin, he disputes many of Coleman's findings. Again,
we limit our discussion to Smith's findings with respect to the student- 
body effect. Smith argues that Coleman made a mechanical error in his 
analysis of the individual's background. In essence, the wrong variables 
were entered into the empirical study: 

This mechanical error affected the strength of the relation-
ship between individual verbal achievement and the Student
Body factor more than any other relationship.... The Report's
estimates of the amount of achievement variance explained by
the Student Body factor are severely reduced when the intended
background controls are used... (Author's emphasis, pp. 63-65.)

Smith goes on to argue that in one of these mechanical errors the
percentage of high school students taking college curriculum was erron-
eously entered into the empirical analysis in place of the percentage who
intended to go to college. This variable played an important role in
establishing the significance of the student-body effect. Coleman
interpreted this variable as a measure of the aspirations of the student
body. He felt that its significance in explaining student outcomes
indicated that students who attend schools where the student body has
high aspirations perform better than otherwise similar students who
attend schools where the student body has lower aspirations. Hence,
there is a student-body effect.

Smith points out, however, that Coleman's data were collected from
academic, vocational, and comprehensive high schools. The
original analysis did not distinguish among the three. There is a
selection process whereby students are assigned to schools on the basis of
their presumed ability. And the proportion of a high school's students
taking a college curriculum may simply be a measure of whether high (presumed)
ability students are assigned to that school. Hence, Smith argues, the
proper interpretation of Coleman's empirical results is that students
assigned to schools for pupils of high ability perform better than
students assigned to other schools:
the result of an assignment process, not a student-body effect.

"In summary, there is [...] have a strong independent influence on the
verbal achievement of students." (Author's emphasis, p. 76.)

Our review of the evidence as to the existence of peer-group in-
fluences suggests four main conclusions:

1. There is no strong evidence that student-body effects exist.
In particular, there is no evidence that the racial composition
of a student body affects the performance of individual members
of that student body.

2. There is no strong evidence to the contrary. Many researchers
have argued that alternative and more likely hypotheses could
have led to the results' being interpreted as student-body
effects. But no researcher has shown that student-body effects
do not exist.

3. There is no evidence in the production-function literature that
student-body effects might be negative.

4. The entire controversy over the existence of student-body
effects and the absence of conclusive empirical results stem
from the data problem described earlier. So long as production-
function research is based on data generated by natural experi-
ments, it will be difficult, if not impossible, to isolate
completely the relative contributions of school resources,
background factors, and peer-group influences.



School Resources 

Our examination of the production-function literature suggests two 
findings with respect to school resources: 

o School resources are seldom important determinants of student
outcomes.

o No school resource is consistently related to student outcomes.

The first finding can be intuitively expressed in the following 
terms. Suppose we knew what resources a student had received but nothing 
about his background or the backgrounds of his fellow students. Using 
this information in a production function, we could predict the student's
outcome with only slightly more accuracy than if we knew only his grade 
level. In rough terms, knowing what resources a student received would 
allow us to predict his outcome about 5 percent more accurately. 

On the other hand, suppose we knew both the student's background 
and the resources he received. Suppose, further, that we "controlled"
for the influence of background factors by examining how much more
accurately (as compared with knowing only a student's grade level) we 
could predict his outcome on the basis of his background and then asked 
how much further accuracy we could get if we added our knowledge of the 
school resources he received to the prediction. In this case school 
resources would add roughly 1 percent to the accuracy of our prediction. 

The difference between these two numbers (1 percent and 5 percent)
stems from the analytical problem described earlier. There is considerable
overlap between students' backgrounds and their school resources. If we
consider only school resources, the influence of the overlap is entirely 
attributed to the resources. If we consider only background factors, the 
influence of the overlap is entirely attributed to them. Finally, if we 
consider only background factors and attribute the influence of the over- 
lap to them and then add school resources, the resources are attributed 
none of the overlap. In short, we can be sure that school resources 
contribute between 1 and 5 percent to our prediction of student outcomes. 
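
The gap between the two figures can be reproduced in a small simulation. The sketch below is illustrative only: the correlation of 0.5 between background and resources, the coefficients, the sample size, and the seed are assumptions chosen to make the overlap visible, not estimates from the studies reviewed. It uses the standard two-predictor formula for R-squared in terms of simple correlations.

```python
import random
import statistics


def pearson(x, y):
    """Simple correlation between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def two_predictor_r2(r_y1, r_y2, r_12):
    """R-squared of a regression on two predictors, from correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


rng = random.Random(1)
n = 20000

# Assumed setup: resources overlap with background (correlation 0.5),
# and background has the larger direct effect on the outcome.
background = [rng.gauss(0, 1) for _ in range(n)]
resources = [0.5 * b + (0.75 ** 0.5) * rng.gauss(0, 1) for b in background]
outcome = [0.35 * b + 0.10 * r + rng.gauss(0, 0.8325 ** 0.5)
           for b, r in zip(background, resources)]

r_yb = pearson(outcome, background)
r_yr = pearson(outcome, resources)
r_br = pearson(background, resources)

# Resources considered alone are credited with all of the overlap.
resources_alone = r_yr ** 2

# Resources added after background are credited with none of it.
resources_added_last = two_predictor_r2(r_yb, r_yr, r_br) - r_yb ** 2
```

With these assumed values the first quantity is several times the second, although the data and the underlying relationship are the same in both cases. Only the order of attribution differs.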

Thus far we have focused on the overall contribution of school 
resources to student outcomes. Almost every study finds one or two or 
three school resources to be significantly related to student outcomes.
But these studies generally examine a large number of school resources.
Along with the two or three resources that are found to be significant,
many are found to be insignificant. And, when we compare the results of
various studies, we find that the same resources do not recur in the
lists of significant variables from one study to the next. For that matter,
it is not unusual to find a research report in which the students have 
been divided into a number of groups by some stratification rule, with 
separate analyses yielding distinctly different results with respect 
to the significance of school resources for each group. To summarize: 

o There is no strong evidence that any particular school resource
is an important determinant of educational outcomes.

o Neglecting the issue of which school resources are important,
there is no strong evidence that school resources in general
have a significant impact on educational outcomes.

Background Factors 

Two results concerning the effects of background factors emerge 
from the analysis: 

o Background factors are always important determinants of educa-
tional outcomes.

o The socioeconomic status of a student's family and community is
consistently related to his educational outcome.

^For an extended discussion, see Mayeske et al. (1969).

In terms of the intuitive notion of predictive accuracy we have
been using, we could predict a student's outcome roughly 15 percent more
accurately if we knew his family's socioeconomic status. Further, in 
every study, the socioeconomic status of the student's family and of the 
community in which he lives proves to be significantly related to his 
outcome. 

All in all, then, the production functions estimated thus far enable
us to use information regarding a student's background and the services 
he received from his school to predict his outcome somewhat more accurately. 
However, this improvement in accuracy comes, for the most part, from our 
ability to take account of a student's background in making our predic- 
tion. Knowledge of the resources the student received has proved to be 
of minor value. An obvious implication of this argument is that, if 
knowing the amounts of the various school resources a student has re- 
ceived does not enable us to predict his outcome more accurately, we have 
little reason to believe that receiving these resources has had much impact 
upon his outcome. 

RESEARCH PROBLEMS 

The researcher who attempts to estimate an educational production 
function encounters problems on many levels. One serious problem is 
that we may not even be asking the right questions. What is it that we 
are trying to accomplish? As an example of this sort of problem, con-
sider the concept of out-of-school learning. Many researchers have
argued that students spend a relatively small proportion of their time
actually in classrooms supposedly learning something. It is quite pos-
sible that considerable learning goes on out of school. Thus, schools
may be making a tremendous difference; but if this difference is still
small in comparison with total learning, it is hard to isolate.

Even if we are asking the right questions, we may encounter serious
substantive problems. Consider, for example, the possibility that the
production function is student-specific. Suppose that different students
have different learning patterns and that the importance of any particular
resource varies with the learning patterns of a student. Then these
resources could be extremely important to some students. But because
the production function essentially averages over all students, this 
student-specific relationship goes unnoticed. 
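
A minimal simulation shows how averaging can hide a student-specific relationship. Everything in it is an assumption made for illustration: one student in five is taken to respond strongly to the resource (a slope of 1.0), the rest not at all, and the sample size and seed are arbitrary.

```python
import random
import statistics


def ols_slope(x, y):
    """Least-squares slope of y on x (single predictor)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    covariance = sum((a - mx) * (b - my) for a, b in zip(x, y))
    variance = sum((a - mx) ** 2 for a in x)
    return covariance / variance


rng = random.Random(2)
n = 10000

resource = [rng.gauss(0, 1) for _ in range(n)]
# Assumed: about 20 percent of students respond to the resource.
responsive = [rng.random() < 0.20 for _ in range(n)]
outcome = [(1.0 if resp else 0.0) * r + rng.gauss(0, 1)
           for r, resp in zip(resource, responsive)]

# One function estimated over all students.
pooled_slope = ols_slope(resource, outcome)

# The same function estimated only for the responsive students.
responsive_slope = ols_slope(
    [r for r, resp in zip(resource, responsive) if resp],
    [o for o, resp in zip(outcome, responsive) if resp],
)
```

The pooled slope is a diluted average of the two groups, so a resource that matters a great deal to some students appears to matter only slightly to students in general. In practice the responsive group would not be labeled in advance, which is what makes the relationship hard to detect.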

There are numerous methodological problems. Many researchers have
pointed out, for example, that schools aim at more than one outcome.
They do not merely aim to teach a student how to perform well on some
standardized achievement test. At the very least, they are interested
in teaching reading and mathematics, minimizing dropouts, and im-
parting a number of noncognitive skills. Schools may be using their
resources with different emphasis with respect to outputs. Suppose,
for example, that we compared four schools and that in one school the
teachers spent all their time teaching reading, in another school they
were all emphasizing mathematics, in a third they were all behaving as
jailers and trying to keep the students out of trouble, and in the fourth
they were all looking toward various noncognitive skills. When we
examine the relationship between reading achievement and use of teachers
in these schools, we are not apt to find a significant relationship.

The statistical method of handling this sort of problem is termed
simultaneous equations. There have been some attempts (a very few)
at using these techniques, but they have not been very successful. In
general, there is good reason to believe that our statistical techniques
have just not been up to the kinds of problems we are addressing. Further-
more, these statistical techniques, and in particular their limitations,
are seldom well understood by the people using them.

Finally, there are many straightforward measurement problems. We
are trying to measure extremely difficult things in educational research.
We may believe that the ability of a teacher to teach influences what
his or her students learn. But no studies in this approach have yet 
used any direct measure of teaching ability. Instead, they have used 
proxy variables, such as a teacher's salary or verbal ability or experience. 
But if more experienced teachers are not better teachers, and if higher

^See Section IV for evidence that this is so.

paid or more verbally facile teachers are not better teachers, then ex-
perience or verbal ability or salary will not yield significant results.
But that does not mean that teaching ability has no impact upon student
outcomes.
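
The proxy-variable problem can be sketched in the same way. The simulation assumes, purely for illustration, that an unobserved teaching ability raises student outcomes and that the observed proxy (labeled experience here) is unrelated to that ability. The effect size of 0.5, the sample size, and the seed are arbitrary.

```python
import random
import statistics


def pearson(x, y):
    """Simple correlation between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(3)
n = 5000

# Unobserved in practice: the teacher's actual ability to teach.
ability = [rng.gauss(0, 1) for _ in range(n)]
# Observed proxy, assumed here to be unrelated to ability.
experience = [rng.gauss(0, 1) for _ in range(n)]
# Outcomes depend on ability, not on the proxy.
outcome = [0.5 * a + rng.gauss(0, 1) for a in ability]

corr_with_ability = pearson(outcome, ability)
corr_with_experience = pearson(outcome, experience)
```

A study that could only observe `experience` would report no relationship, even though teaching ability matters by construction in this sketch.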

SUMMARY 

Research into educational effectiveness by means of the input- 
output approach has not, as yet, yielded consistent results regarding 
the importance of school resources. Background factors tend to dominate 
the results. No single resource consistently appears to exert a power- 
ful influence on student outcomes. Some school resources appear to be 
important in each study, but the same resources appear to be unimportant 
in other studies. In fact, there is very little evidence that school 
resources in general have a powerful impact upon student outcomes, even 
neglecting the question of which school resources are influential. 

This body of research has, as a result, not identified what parti-
cular resources should be provided to students. It has, however, yielded
one important policy implication. The resources for which school systems
have traditionally been willing to pay a premium — teachers' experience, 
reduced class size, and teachers' advanced degrees — do not appear to 
be of great value. Inexperienced teachers do not appear to produce 
students whose outcomes are significantly worse than the outcomes of 
students whose teachers are experienced, other things being equal. 
Similarly, students whose teachers have advanced degrees or who are in
small classes do not do better, other things being equal, than students
whose teachers lack advanced training or who attend large classes.

It must also be emphasized that these results should not be inter- 
preted as indicating that school resources do not affect student outcomes. 
We can only observe that these studies have failed to show that school 
resources do affect student outcomes. The difference between these two 
points is a reflection of the problems encountered in doing research in 
the input-output approach. There are many fundamental difficulties in
this research approach, any one of which could have led to the incon-
clusive results cited above. And, of course, there is no way to determine
whether the absence of results stems from the absence of an underlying
relationship between school resources and student outcomes or from a
research method that could not find results even if they were actually
there.^



For a general discussion of this point, see Levin (1969). 




IV. THE PROCESS APPROACH 



The general purpose of research on the process approach is to im- 
prove our understanding of the way in which education takes place and to 
determine factors affecting educational outcomes. A wide variety of re-
search interests are relevant to understanding the educational process.
Studies of teachers' characteristics (skills, behavior, personality, and
the like) are obviously relevant, as are studies of teaching methodology.
Basic psychological studies of learning are relevant as well, but few
results are directly applicable to the classroom. Perhaps most important
in the long run are psychological studies of learning in instruction,
individual differences, child development, and personality; these studies
are beginning to define student characteristics and instructional prac-
tices that are crucial in determining educational outcome.

This review of research covers studies of the educational process 
as undertaken in the classroom, as well as studies made in the psycho- 
logical laboratory that appear to have relevance for the educational pro- 
cess. Laboratory and classroom studies are distinguished in this report 
not so much on the basis of where the study took place as on the basis of 
the study objectives, the learning tasks, and the kinds of outcome mea-
sures. Classroom studies involve meaningful teaching activities and have
the objective of improving our understanding of education in the class- 
room. Some measure of educational outcome is generally used (achievement 
tests, grades, and teacher or supervisor ratings). Laboratory studies 
generally have more theoretical objectives such as advancing knowledge 
about psychological phenomena, testing theory, or investigating empirical 
relationships between psychological variables. In these studies, measures 
of outcome are varied and difficult to summarize. They are, however, 
generally based on the learning or retention of well-defined and highly 
specific responses. The experimentalist is not primarily concerned with 
the amount learned, but with the way in which the learning takes place 
and the factors that affect learning or retention. For example, an 
experimenter might present both auditory and visual stimuli in pairs 
to children to investigate the different effects of each type of stimulus
on learning and retention. The stimulus pairs are presented to each
child until he can recall without error the second stimulus in each pair 
upon presentation of only the first. The measure of learning is the 
number of presentations necessary before the child has learned the list 
of stimulus pairs without error. This measure is studied across age 
groups to determine whether age-related differences exist in the learning 
of visual or auditory stimuli. 
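
The learning measure in this example, the number of presentations needed to reach an errorless recall of the list, can be sketched as follows. The record layout, the function name, and the sample data are hypothetical. Each presentation is represented as a list of true/false values, one per stimulus pair, marking whether the second stimulus was recalled correctly.

```python
import statistics


def trials_to_criterion(presentations):
    """Number of the first presentation with every pair recalled correctly.

    Returns None if the child never reaches an errorless presentation.
    """
    for trial_number, recalled in enumerate(presentations, start=1):
        if all(recalled):
            return trial_number
    return None


# Hypothetical records for four children, each learning three pairs.
children = [
    {"age": 6, "presentations": [[True, False, False],
                                 [True, True, False],
                                 [True, True, True]]},
    {"age": 6, "presentations": [[False, False, True],
                                 [True, True, True]]},
    {"age": 9, "presentations": [[True, True, True]]},
    {"age": 9, "presentations": [[True, False, True],
                                 [True, True, True]]},
]

# Collect the measure by age group; every child in this sample reaches
# criterion, so no None values enter the averages.
by_age = {}
for child in children:
    by_age.setdefault(child["age"], []).append(
        trials_to_criterion(child["presentations"]))

mean_trials_by_age = {age: statistics.fmean(t) for age, t in by_age.items()}
```

Comparing such averages across age groups, separately for visual and auditory lists, is the kind of analysis the example describes.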

The reader should realize at the outset that classroom and laboratory 
studies differ greatly in their objectives and approaches. Classroom 
studies have not generally produced highly definitive results. Labora- 
tory studies, however, have produced many significant and consistent 
results, but their relevance for classroom learning is often not clear.

Thousands of studies relevant to education are published each year.
To review them all would be impossible within the time and resources
available. Fortunately, there are a number of review articles and books 
in each of the areas of concern covering broad areas of research. Some 
of these reviews merely summarize a large number of studies; consequently, 
one must go to the original sources. Other reviews criticize and analyze 
as well as summarize. Some relate studies to one another and to basic 
issues in methodology and education. These reviews are easier to read 
and comprehend, although there is the risk of being swayed by the par- 
ticular orientation of the reviewer. To cover a wide range of educational 
research and to give the reader a comprehensive view of the vast area 
of process research, we have drawn upon analytical review articles in
this report. In many cases, the original studies were read to check on 
the reviewer's summary and conclusions, but generally we do not cite 
original references in this report. In other cases, the same study was 
discussed in more than one review; this is especially true for the more
important studies. This redundancy is a great help in assessing the
amount of "bias" present in a review. By and large, we found reviews
to be remarkably unbiased. We have tried to give a general indication
of the excellence of various reviews and also to indicate some of the
specific studies that are crucial.

In general, we relied upon a review if it summarized findings
across studies and gave an evaluation that did not contradict our own 
evaluation of studies defined as critical. A study was considered good 
if it provided enough information by which to judge its internal validity, 
and if it did in fact appear internally valid. We have made frequent 
use of quotations, mostly to give the reader an impression of the pre- 
vailing atmosphere or to clarify a point. 

This section contains three subsections. In the first, we present 
the general results of research on teacher characteristics. We are 
primarily concerned with research that relates teachers’ skills, be- 
havior, attitudes, or personality to some measure of student achievement. 
The second subsection presents the results of research on instructional 
method. Some of this research has been conducted in the classroom, but 
most of it is from the psychological laboratory. This is the case 
particularly with studies that report positive results; most classroom 
studies are at best inconclusive. Finally, we present the results of 
research that is concerned in some way with students and their charac-
teristics. This subsection on interactions between students and education
draws on research that reveals the importance of individual characteris-
tics to achievement. The basic theme is that students respond differ-
entially to educational factors (teachers and instructional method)
depending on their own characteristics; that is, there is a student-
teacher-method interaction. To anticipate, we believe that the presence
of these interactions is one of the more important factors brought out
in this report. The notion of interaction will be elaborated in some
detail below.

THE EFFECTS OF TEACHERS 

Studies of teacher characteristics have abounded since the 1930s 
and now number in the thousands. In spite of this large implied ex- 
penditure of time and money, little is known about what constitutes 
desirable teacher characteristics or, especially, about the influence 
of teachers on student performance. With the exception of a few recent
studies, student achievement has rarely been used as a criterion, and
therein lies the greatest weakness of research in this area. Attempts 
to use other criteria, such as supervisor or fellow-teacher ratings,
are not successful, in that the ratings do not correlate with student 
achievement (Harris, 1969). This lack of correlation could mean either 
that the ratings are based on other indicators of success than achieve- 
ment or that supervisors and teachers do not have a good idea of what 
constitutes superior teaching. 

Past research has focused on measuring various attitudes and per- 
sonality traits of teachers, with some attempts to relate these to 
supervisors’ estimates of classroom success. Often, the studies simply
intercorrelate various tests of teacher attitudes, interests, intelli- 
gence, and so forth. In the end, either these studies show contradictory 
results or the results have little practical value, and quite often 
both are true. To quote Getzels and Jackson (1963):

For example, it is said after the usual inventory tabu- 
lation that good teachers are friendly, cheerful, sympathetic, 
and morally virtuous rather than cruel, depressed, unsympathetic,
and morally depraved. But when this has been said, not very 
much that is especially useful has been revealed. For what 
conceivable human interaction — and teaching implies first 
and foremost a human interaction — is not the better if the 
people involved are friendly, cheerful, sympathetic, and vir- 
tuous rather than the opposite? 

In any event, there is reason for questioning the payoff in useful 
results from studies of teacher attitudes and personality characteristics. 
Variables related to attitude and personality are difficult to define 
and more difficult to measure, especially in what is essentially a normal 
(healthy) population. Further, it seems reasonable to assume that teacher 
classroom behavior and techniques are more important than attitude or 
personality. Of course, dimensions of attitude and personality are 
reflected in the teacher’s classroom behavior (Turner and Denny, 1969), 
particularly the degree to which the behavior can be modified through 
training. However, whatever the influence of personality and attitude 
factors, it is the teacher’s classroom behavior that the student responds 
to, and it is necessary to understand how this behavior is related to 
student achievement. 

Teacher Characteristics and Student Achievement

In contrast to the bulk of research on teacher characteristics, 
there are a few (ten experimental and 50 correlational) recent studies 
that relate teacher classroom behavior to student achievement. Two
general approaches exist for studying the effects of teacher behavior 
on student achievement. The more powerful is an experimental approach 
whereby teachers are trained in a specific method, and student achieve- 
ment under this method is compared with student achievement under an 
alternative method. Studies of this type must meet all the demands of 
an experimental approach (for example, random assignment of students to 
teachers) in addition to special demands arising from the situation. 

Foremost among these special demands is the requirement for measures of 
actual classroom transactions, since only by observing the teacher can 
one be assured that the intended method was actually used. Moreover, 
data on classroom transactions are the only source of information on the 
content (rather than result) of the student-teacher relationship. Many 
studies in education lack measures of classroom transactions, and studies 
of the effectiveness of different teaching methods are rendered useless 
as a result. For example, training a teacher in a specific method is no 
assurance that the method will be used in the classroom. In an excellent 
review of research on teaching, Rosenshine and Furst (1971) could find no
more than ten studies that use the experimental method adequately and 
that provide data on classroom transactions. 

The more frequently used approach for relating teacher performance 
to student achievement is to correlate the two as they occur in the 
normal classroom. That is, no attempt is made to manipulate teaching 
methods experimentally. Various dimensions of teacher behavior are 
observed and rated, and these ratings are correlated with some measure 
of student achievement. The danger in this approach is that
correlational relationships can suggest false causal connections. For example,
a high correlation between clarity of presentation and student achieve- 
ment does not mean that clarity causes high achievement. It is just as 
likely that both are the result of some other factor, say, teacher verbal 
ability or general intelligence. Rosenshine and Furst (1971) find
approximately 50 studies that use the correlational procedure.
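
The danger described above can be illustrated with a small simulation.
The following is only a sketch under stated assumptions: the data are
simulated, not drawn from any study cited here, and the variable names
(ability, clarity, achievement) are hypothetical. A third factor drives
both the clarity rating and the achievement score, so the two correlate
even though neither affects the other; a partial correlation that holds
the third factor constant removes most of the association.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y with z held constant."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Simulated (hypothetical) data: teacher verbal ability influences both the
# observer's clarity rating and class achievement. Clarity has NO direct
# effect on achievement in this simulation.
rng = random.Random(0)
n_teachers = 2000
ability = [rng.gauss(0, 1) for _ in range(n_teachers)]
clarity = [a + rng.gauss(0, 1) for a in ability]
achievement = [a + rng.gauss(0, 1) for a in ability]

raw = pearson(clarity, achievement)
adjusted = partial_corr(clarity, achievement, ability)
print(f"clarity vs. achievement:             r = {raw:.2f}")
print(f"same, holding verbal ability fixed:  r = {adjusted:.2f}")
```

In this simulation the raw correlation is sizable while the partial
correlation is close to zero, which is the pattern one would expect if
clarity were merely a marker of some other teacher characteristic.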

Studies using the experimental and correlational approaches have
produced some consistent and significant results. These are summarized 
by Rosenshine and Furst, and the results are grouped according to 11 
kinds of behavior significantly correlated with achievement scores. 

Five of these are strongly supported by the research, the others not so
strongly. The first five variables are: clarity of teacher's presentation,
variability of teacher's classroom activities, teacher enthusiasm,
degree to which the teacher was task- or achievement-oriented or
businesslike, and student opportunity to learn criterion material. The six
variables less strongly related to student achievement are: use of student
ideas or teacher indirectness, use of criticism, use of structuring
comments, use of multiple levels of discourse, probing, and perceived
difficulty of the course.

At first glance, the above list of the strongest
findings may appear to represent mere educational platitudes.
Their value can be appreciated, however, only when
they are compared to the behavioral characteristics, equally
virtuous and "obvious," which have not shown significant or
consistent relationships with achievement to date. These
variables . . . are listed below, and the method by which they
were assessed follows in parenthesis: nonverbal approval
(counting), praise (counting), warmth (rating), ratio of all
indirect behaviors to all direct teacher behaviors, or the
I/d ratio (counting), flexibility (counting), questions or
interchanges classified into two types (counting), teacher
talk (counting), student talk (counting), student
participation (rating), number of teacher-student interactions
(counting), student absence, teacher absence, teacher time
spent on class participation (rating), teacher experience,
and teacher knowledge of subject area.^

Rosenshine and Furst go on to discuss necessary refinements in 
future correlational studies. Of great importance is the need for more 
experimentally controlled research, with better measures of classroom 
transaction and broad indicators of outcome measures of student
achievement. Classroom studies of the effectiveness of teacher and
instructional techniques depend on the refinement and increased use of
observational data systems. This need is commented on in many articles,
and there have been a number of attempts to develop and refine
observational data systems (Bloom et al., 1971; Rosenshine, 1970a;
Hanley, 1970). Unfortunately, none is used widely enough or consistently
enough to realize its potential fully.

^See Rosenshine and Furst (1971). "Counting" refers to the number
of times a specified behavior occurred. "Rating" refers to subjective
estimates by a judge (teacher, student, observer) of how the teacher
performs with respect to some behavior. The behavior is rated into a
number of categories in terms of desirability.

Teacher Skills and Effectiveness

The teacher's skills in the classroom are rarely determined directly; 
most investigations of teacher skills simply rely on supervisors' ratings. 
The only studies we could find that measured teacher skills directly were 
by Turner (1968). He investigated differences in teacher skills and
characteristics as a function of characteristics of school districts.
In this study and in previous ones, he developed instruments for measuring
teacher skills in diagnosing learning difficulties and organizing or
sequencing learning material in the subject areas of reading, arithmetic,
and science. His 1968 study also included measures of teacher
personal-social factors encompassing warmth-spontaneity, classroom
organization, educational viewpoint, emotional stability, and involvement
in teaching. The validity of the various scales was determined by measures
of internal consistency, that is, the degree to which teachers score
consistently on each scale. It is important to note that validity was never
determined on the basis of a relationship to student achievement.

The results of the study indicate that teachers differ significantly 
in the characteristics examined, and that a relationship exists between 
the attractiveness of school districts and teacher characteristics (which 
should come as no surprise). Before making much of these results, we 
should stress that teacher characteristics must be related to student 
performance. It is of interest to know that attractive school districts 
(in terms of location, money, and students) obtain teachers who apparently 
have the more desirable characteristics. However, the important question 
is whether these characteristics make a difference in student achievement 
and, further, for what kinds of students they make a difference (if any). 

In a later study, Turner and Denny (1969) relate the above-mentioned
teacher characteristics to student creativity, as measured on a scale
developed by Denny and others. In summarizing, the authors state: 

Teacher characteristics are distinctly associated with 
changes in pupil characteristics, as well as with teachers' 
behaviors in the classroom, which in turn are associated with 
changes in pupil characteristics. Specifically, the results 
reported suggest that teachers characterized as warm and 
spontaneous and teachers characterized as child-centered tend 
to obtain the greater positive changes in pupil-creativity.
These changes appear to come about through teacher classroom
behaviors that involve positive reinforcement of pupil responses,
through adaptation of activities to pupils, through attention
to individuals, and through variation in activities and
materials.

Unfortunately, the authors do not present their procedures or data 
in sufficient detail to allow us to evaluate their study. However, if 
the results can be replicated, the findings and method used are certainly 
important. For one thing, a measure of student outcome other than cog- 
nitive achievement was used, although the results would have been 
stronger if a measure of cognitive achievement had also been used. 

If teachers vary significantly in teaching skills and classroom 
behavior, one would expect differences in teacher effectiveness to show 
up in student achievement. Rosenshine (1970b) provides a critical review
of nine studies of teacher effectiveness. Four concern long-term effec- 
tiveness; of these, three measured effectiveness over a school year and 
used grade school teachers. We will discuss the results of the long- 
term studies first. 

All four studies were based on teaching the same material to
different students. The three studies of interest used standardized
achievement tests that give subtest scores in various abilities or
achievements (Stanford Reading Test, Metropolitan Achievement Tests, and
others). The correlations (between the means of groups of students and
teachers) obtained in these studies for the various subtests were
generally around .35 or much lower, with one study showing a correlation
of about .50 for two out of five subtests. The results indicate that
teachers are not generally stable in teaching effectiveness when
presenting the same material over time.
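
To make concrete what such a stability correlation measures, here is a
minimal sketch. All scores and teacher labels below are invented for
illustration and come from none of the studies reviewed. Each teacher's
class mean in one year is paired with the same teacher's class mean in a
later year, and the two sets of means are correlated. Squaring the
correlation gives the proportion of variance the two years share; for a
correlation of .35 this is about .12.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical subtest scores for classes taught by the same five teachers
# in two successive years (same material, different students).
year1 = {
    "T1": [52, 48, 50],
    "T2": [55, 57, 53],
    "T3": [47, 45, 49],
    "T4": [60, 58, 62],
    "T5": [51, 50, 52],
}
year2 = {
    "T1": [54, 56, 55],
    "T2": [50, 52, 51],
    "T3": [49, 48, 50],
    "T4": [56, 55, 57],
    "T5": [53, 47, 50],
}

teachers = sorted(year1)
means_y1 = [mean(year1[t]) for t in teachers]
means_y2 = [mean(year2[t]) for t in teachers]

stability = pearson(means_y1, means_y2)
shared_variance = stability ** 2
print(f"year-to-year correlation of class means: {stability:.2f}")
print(f"proportion of shared variance:           {shared_variance:.2f}")

# The same arithmetic applied to the typical value reported in the text.
typical_shared = 0.35 ** 2
```

A design choice worth noting: the unit of analysis is the class mean,
not the individual student, which matches the description of the
correlations as being between the means of groups of students and
teachers.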



The studies of short-term effectiveness used teaching sessions of
thirty minutes or less. In these studies, teachers taught (1) the same
topic to different groups of students (three studies), (2) different
topics to the same group of students (four studies), or (3) different
topics to different groups of students (four studies). In each case
three of the studies were carried out by the same investigator (Fortune).
Students ranged in grade level from Head Start to the twelfth grade.
When teachers taught the same topic to different students, the
correlations (between student groups and teachers) were quite high (.22 to
.70); but in the other two cases the correlations were extremely erratic,
and few were significant.

These findings, showing a lack of consistent teacher effectiveness,
raise doubts as to the meaningfulness of the findings of Turner and
Denny discussed above. Although teachers may vary in skill, their
effectiveness does not appear to be generalizable over time or topics.
Studies of teacher skills and effectiveness are extremely limited,
however, and any conclusion must be tentative. In addition, although it is
necessary to relate teacher skills and characteristics to student
achievement, there are grounds for questioning the adequacy of the measures
of student achievement used in these studies. Teachers may be consistent
in their effectiveness on other dimensions of education outcome, but we
have been unable to find studies that report on this possibility. The
lack of stability in teacher effectiveness may explain, in part, why
studies of teacher characteristics have proven so futile: these
characteristics either have no identifiably consistent effect or are not
stable.

^The low correlations may result from a student-teacher-subject
interaction. Teachers are not equally effective with all students and
all topics; correlations will vary with topic and the specific
characteristics of the students. Also, experiments based on thirty-minute
teaching sessions may not offer very much evidence about anything relevant.

Teacher Expectations

Rosenthal and Jacobson (1968) have reported on the importance of
teacher expectations as a determinant of student performance. However,
this report has been criticized on methodological grounds, and few of
the results appear to be substantial. Data are lacking in these studies
on causative factors, both on the establishment of teacher expectations
and on the mechanisms by which teachers communicate their expectations.
Recently, two studies investigated some of the mechanisms involved in
the establishment, communication, and effect of teacher expectations.

Rist (1970) attempted to uncover factors that establish teachers’
expectations concerning students, and the effect of these expectations
on the classroom behavior of both teachers and students. This study
followed a single class of ghetto children through kindergarten and
first and second grade. Results indicate that in kindergarten the
teacher’s expectations and identification of "slow" and "fast" learners
are essentially based on social class membership. Data on classroom
transactions indicate a marked difference in the teacher’s attitudes
and behavior toward fast and slow learners and a consequent change in
the behavior of the slow learners. This study was based on a small
sample and needs to be replicated.

Brophy and Good (1970) investigated the process by which teachers
communicate their differential expectations to first-grade children.
Expectations were determined by teacher ratings of students, but no
information is provided as to how the expectations were established.
Results indicate that teachers demanded better performance from children
they rated high in their expectations, and that they praised the children
when it was forthcoming. Teachers demanded less from children they
expected less from, and tended to withhold praise for good performance.

A few other studies have attempted to verify the effect of teacher 
expectations. In general, it appears that expectations probably influence
teacher and student behavior and may influence measured student
achievement. More research is needed to follow up on the interesting hypothesis
of the "self-fulfilling prophecy." 

Student-Teacher Interactions

Throughout this section we have occasionally discussed indirect 
evidence that some teachers are better with some students than with 
others. Thelen (1967) reports direct evidence of this interaction and 
outlines a method for using it to improve classroom behavior and
outcome along a number of dimensions. Essentially, the method involves
assigning students to teachers according to the kind of student the
teacher works with best. The method begins with teacher identification
of students he believes are "getting a lot out of class" versus those
"not getting a lot out of class." The teacher does not describe these
students in any way but simply points them out. Different teachers
do not tend to assign the same students to the two categories, and
Thelen notes (p. 189):



Finally, we found that teachers recognize four kinds of
students: good, bad, indifferent, and sick. But the problem
is that each teacher places different students in these
categories, so that whatever is being judged is certainly not
primarily some characteristic of the student.



The method then establishes the characteristics of students placed in 
the two categories by the various teachers. 

In assigning students to teachers, two criteria can be used: (1)
teachers are given students they work most effectively with, or (2)
students are assigned to teachers they can learn from most effectively.
This procedure requires determining the kinds of students that have
higher achievement than their usual performance with a teacher, and
then assigning teachers students of these types. Thelen's study
indicates that the same student-teacher grouping would not necessarily
result from the application of these two criteria, although there would 
be considerable overlap. In any case, however, the students are better 
off being assigned by either criterion. 
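
The second criterion can be read as a small assignment problem. The
sketch below is an illustration only and is not Thelen's procedure: the
student and teacher names, the gain figures, and the class-size limit
are all hypothetical. Each entry is a student's achievement above his
usual performance when taught by a given teacher, and the code searches
every placement that respects class size for the one with the largest
total gain.

```python
from itertools import product

# Hypothetical gains: achievement above the student's usual performance
# when taught by each teacher.
gain = {
    "Ann": {"Smith": 4, "Jones": 1},
    "Ben": {"Smith": 3, "Jones": 2},
    "Cal": {"Smith": 0, "Jones": 5},
    "Dee": {"Smith": 2, "Jones": 2},
}
CLASS_SIZE = 2  # assumed upper limit on students per teacher


def best_assignment(gain, class_size):
    """Exhaustively search placements for the largest total gain.

    Suitable only for very small examples; the number of placements grows
    as (number of teachers) ** (number of students).
    """
    students = sorted(gain)
    teachers = sorted({t for by_teacher in gain.values() for t in by_teacher})
    best, best_total = None, float("-inf")
    for choice in product(teachers, repeat=len(students)):
        # Skip placements that overfill any teacher's class.
        if any(choice.count(t) > class_size for t in teachers):
            continue
        total = sum(gain[s][t] for s, t in zip(students, choice))
        if total > best_total:
            best, best_total = dict(zip(students, choice)), total
    return best, best_total


assignment, total_gain = best_assignment(gain, CLASS_SIZE)
for student, teacher in assignment.items():
    print(f"{student} -> {teacher}")
print("total gain:", total_gain)
```

Applying the first criterion would require a second table, the
teacher's effectiveness with each kind of student, and the two tables
need not yield the same grouping, as the text notes.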

It follows not only that some teachers do better with some students 
but also that there is no single "best" or "right" way to teach. Future 
research on teaching must account for the different preferences and 
abilities of the teacher. It makes little sense to talk about teacher
skills without also considering the population of students best suited 
for these skills. Studies of long-term trends in teacher effectiveness 
must designate which kinds of students the teacher is effective with, 
as well as how effective he is. The strongest evidence of an effect on 
student achievement for any educational variable appears to be that of
teacher expectations. Pragmatically, it may be better to put this
characteristic of teachers to use than to oppose it or lament its
existence.

THE EFFECTS OF INSTRUCTION 

To simplify this brief overview of research on instruction, we 
will separate this subsection into two main parts. In the first, we 
examine studies of methods of instruction primarily related to learning
in the classroom; in the second, we review psychological research, mostly
in the field of learning, that has direct relevance for the design of
instructional techniques. Studies reviewed in the first part involve
classroom learning; those in the second involve learning tasks that are
dissimilar from normal classroom material, being theoretical rather than 
applied. These are studies of the laboratory type, although the laboratory 
may be a classroom. 

The distinction between the two kinds of research is based on the 
learning tasks used rather than on where the study occurs. It is an 
arbitrary distinction at best. Studies in both parts flow directly from
the experimental-learning tradition in psychology. There is little
reference to individual characteristics of the learner, because of the
attempt to devise general propositions about learning.^

Classroom Instruction 

We begin this part with a brief analysis of research on curriculum 
and instruction. Curriculum refers to instructional material and designs
for its use. Instruction refers to the interaction between teacher and
student as the materials are used. We then present results of research 
on teaching machines, television, and programmed instruction. 

^The few studies of learning that attempt to account for the unique
abilities of the learner are discussed below.

Curriculum and Instruction

An enormous amount has been written about curriculum design and
use. Westbury (1970, p. 239) begins a review with the comment:

Curriculum evaluation appeared as a topic of a chapter in
three of five issues of the 1969 Review of Educational Research.
The emphasis on this topic is, if nothing else, disconcerting
to a reviewer who must plow the same field again; it is also
puzzling when compared with the infrequent appearances of
evaluations of actual curricula or curricular materials in
either the research or the subject journals.



and later:

Evaluations exist in the files and reports of those who
developed curricula. Yet, while these evaluations remain in
files, the proposals and prescriptions of developers circulate
freely, without any readily available critical scrutiny.
There is a literature of curriculum evaluation, but it is
neither publicly available in journals nor has it grown out
of an accessible tradition of formal or informal appraisal
of curricula. There is no "consensus of public knowledge"
on the nature of curriculum evaluation which warrants
methodological formalizations about its character or
provides the substance of such formalizations.



The curriculum research reviewed here is limited to literature that
appears in the professional journals and attempts to evaluate curricula.
This represents only a small part of the total writings on the subject.
The narrative writing describing curricula and discussing theoretical
issues is mostly omitted, which simplifies the summary presented herein
because evaluation has not dominated the curriculum scene by any means.
The subset of evaluation studies is much smaller than the set of
curriculum development programs. In general, evaluations have not led to
many encouraging findings. Because of the complexity of the process
they often lack sufficient scope, so that an absence of positive findings
is not surprising. Westbury (1970, p. 245) summarizes the problem of
matching evaluation schemes to curriculum objectives:



Two separate though interrelated analytical problems must be
faced: curriculum must be conceptualized in such a way that
it no longer carries the connotation that it is a unitary
notion, often a treatment; evaluation must be seen in ways
that permit the development of sets of methods and criteria
so reasoned judgments, appropriate to all senses of curriculum,
become possible. Curriculum evaluation theorists must attempt
to formalize these criteria and methods so they can prescribe
rules for the application of criteria to the full range of
concrete curricular issues.
No current theoretical prescription for curricular evaluation
approaches these goals, although parts of the problem have
been acknowledged by some writers.

Curriculum development programs in science and mathematics have 
been evaluated, at least in some aspects. Some of these are reviewed 
by Romberg (1969), Smith (1969), Welch (1969), and Westbury (1970).
Evaluation studies of curricula developed by the Physical Science
Study Committee (PSSC), Biological Science Curriculum Study (BSCS),
Chemical Education Materials Study (CHEM), and School Mathematics
Study Group (SMSG) are inconsistent in their findings. Oftentimes,
differences between these curricula and conventional ones are small,
and sometimes results favor the conventional method. Some interactions
are noted between student ability and measures of learning for different
curricula; that is, low-ability students may do better in the
conventional curriculum in terms of one measure of learning but poorer in a
new curriculum. Not all learning measures disclose this interaction,
and on some of them the new curriculum is better (Welch, 1969, p. 439).

Westbury (1970, p. 250) summarizes a study by Heron (1969) that 
showed how a teacher's misunderstanding of a program might affect the 
program's success or failure. Heron made no attempt to evaluate cur- 
ricula in terms of output measures. Rather, the study explored three 
evaluative questions related to CHEM, PSSC, and BSCS curricula: 

(1) To what extent is the "inquiry" objective of these programs
actually embodied in the materials produced? (2) How
do the teachers through whom the materials filter perceive
this objective and do they understand "inquiry" well enough
to operationalize any conception of what it might mean in
their classrooms? and (3) How does this objective compare to
the explicit and implicit goal teachers set in their classrooms?

Westbury summarizes the findings:

The results of his application were disappointing. Despite
the claims of the developers for their materials, they were
found to present little more than a "somewhat sophisticated"
version of a "less competent" view of method. The teachers
who had been attending workshops on the new materials were
found to have almost no conception of what might be meant
by a claim to teach the "nature of scientific inquiry."

The innovative science curricula such as those discussed above place
heavy emphasis on the role of inquiry or learning by discovery, an
emphasis that Ausubel (1965, p. 259) has severely criticized:

Much of this "heuristics of discovery" orientation to the
teaching of science is implied by the view that the principal
objectives of science instruction are the acquisition of
general inquiry skills, appropriate attitudes about science
and training in the operations of discovery. Implicit or
explicit in this approach is the belief that the particular
choice of subject matter chosen to implement these goals is a
matter of indifference (as long as it is suitable for the
operations of inquiry), or that somehow in the course of
performing a series of unrelated experiments in depth, the
learner acquires all of the really important subject matter
he needs to know.



Later in this section we discuss theories of instructional organi- 
zation (including Ausubel's). These approaches emphasize the importance 
of instructional structure in acquiring knowledge. It is not surprising 
then that Ausubel should conclude that incidental learning as a
by-product of discovery cannot compare to a graded and systematically
organized approach.

The idea of learning by discovery has become a popular one through- 
out education, particularly among those calling for reforms in classroom 
teaching. The complex issues involved in this concept are the topic of 
an excellent book edited by Shulman and Keislar (1966). The book
emphasizes that learning by discovery does not mean laissez-faire education.
The difference is in the way control is exerted, not the lack of it.
In general, learning by discovery has not been proved to have a great
advantage over conventional methods. Cronbach (1966) points out that
research is needed to determine what advantages learning by discovery
offers and under what conditions its benefits are accrued.

Although curriculum development is far from being on firm ground,
and in spite of a general lack of evaluation, some progress is being
made. The current status of curriculum development and evaluation in
terms of its accomplishment and shortcomings is seen in the following
quotations:

In brief summary, during the past decade significant progress
has been made in the precise definition of curricular objectives,
in the analysis of ends/means relationships, and in the
effective ordering of stimuli for learning. Substantial progress
has been made in extending both the understanding of the
evaluative process and the use of evaluative data in
diagnosing the possible causes of discrepancies between curricular
expectancies and curricular accomplishments. In the realm of
explaining curricular realities, however, we appear to know
little more in 1969 than we knew in 1960. Curricular theory
with exploratory and predictive power is virtually nonexistent.
Goodlad (1969, p. 374).

Research during the period of this review shows a desirable
tendency toward a broader spectrum of concern, but still
lacking are systematic longitudinal studies showing the
impact of varied methods and materials on student attitudes,
understanding, performance, and motivation. Current research
seems to be mainly discipline-centered rather than pupil- or
learning-centered, and the ends of education appear to be too
often subordinated to transitory fashions in educational
haberdashery. Smith (1969, p. 409).

One conclusion seems obvious. Only at centers where there
has been a concentrated effort to investigate many facets of
a course or teaching method by a group of researchers does
one find any discernible evidence of advancement. Welch (1969,
p. 441).

Theory must inform the deliberation that is evaluation but
at the same time it must grow from deliberation. The problem
implicit in this assertion is mapped by the requirement that
curriculum and evaluation workers find a theoretical structure
that permits them to embrace the particular and concrete with
seriousness before they attempt theoretical speculation of
any kind. We are far from this at the moment. Westbury (1970,
p. 257).

Rosenshine (1970a) indicates that a central problem in evaluation
is determining the actual teaching practice that takes place within any
given curriculum. Because teachers vary widely in their skills,
attitudes, beliefs, and dispositions, they do not all do the same thing given
the same curriculum. Simply producing a curriculum does nothing in
terms of its implementation, and evaluations of different curricula are
generally useless without data on classroom transactions. In summarizing
the shortcomings of evaluation of curricula, Rosenshine states:

Currently three major needs are: greater specification of
the teaching strategies to be used with instructional materials,
improved observational instruments that attend to the context
of the interactions and describe classroom interactions in
more appropriate units than frequency counts, [and more research]
into the relationship between classroom events and student
outcome measures (p. 296).



Some progress is being made in defining classroom transaction and
relating it to student outcome. Some studies that relate specifically
to the teacher’s mode of presentation were discussed previously;
however, as yet there is little demonstrable evidence for accepting any
particular curriculum as being better than another. This is a gross
generalization and perhaps does not do credit to some programs. Of
course, some curricula are undoubtedly better than others and "everyone
knows it." Unfortunately, demonstrating curriculum effectiveness is
extremely difficult.



Instructional method studies have failed for essentially the same
reason as curriculum studies: a lack of classroom transaction data.
Reported studies find no consistent indication for the superiority of
any instructional method. For example, research on discussion versus
lecture has a long history, but as Stephens (1967, p. 81) concludes:
"It has been found in summary after summary that no distinction between
the two methods can be found." Studies of instructional method rarely
control for student or teacher characteristics, and it is entirely
possible that one method may be superior to another for some students
and with some teachers. It is unreasonable to assume, for example, that
all teachers are equally effective using the discussion method, or that
because one is effective using the discussion method, he will also be
effective using the lecture approach. Before instructional methods can
be evaluated, certain student and teacher characteristics must be
defined, and data must be provided on the transactions between them.







Television and Programmed Instruction 

We turn now to the topic of teaching machines, programmed instruc- 
tion, and other technologically oriented aspects of instruction. The 
research on teaching technology has been much reviewed, and only the 
major studies will be mentioned. A detailed and lengthy history as well
as a critical and summary review of such research is provided by Saettler 
(1968). A brief overview of history and research including comments on 
general shortcomings of the field is given by W. H. Allen (1971). A 
lengthy evaluation and review of research on learning from television 
is provided by Chu and Schramm (1967). A number of other reviews of 
specific areas will be cited in the following pages. 



The early and intense interest in television learning led to a
large-scale development with little in the way of controlled research.
Many claims were made for the success of these programs. Subsequent
research did not support the claims, although as Chu and Schramm (1967,
p. 176) point out:



In a sense, instructional television is more complex than the 
research that deals with it. Complex behavior has baffled 
learning theorists for years. A number of variables are
clearly at work determining what a given individual learns 
from the television. In many cases these variables inter- 
act, and the total must be a great deal more complex than can 
be represented by the one variable experiments that typically 
make up the research literature, no matter how clean and skill- 
ful they are. 

However, after hundreds of studies, it can only be concluded that
learning by television is about as effective as conventional classroom
learning, and a case cannot be made for the superiority of either.
Effective television teaching grows out of the application of sound
teaching methods, such as simplicity, organization of material, and
practice, and apparently not from any special mode of presentation. The
advantages of television learning are not evident in any identifiably
superior result, but rather in the ability to reach a larger audience
and to augment conventional methods. Further research is required to
determine under what conditions television learning [...] television
presentation are responsible for learning. However, the same comment
holds for conventional teaching. In general, little is known about
factors that actually promote learning.



The most direct application of learning principles has been in the 
area of programmed instruction. This literature is reviewed in many 
places and is commented on in almost every review of educational re- 
search. Interest in programmed instruction, which surged a little over 
a decade ago, has waned considerably over the past five years (Corey, 
1967). The conditioning approach of Skinner (1968), following his
success in conditioning the behavior of animals, has been applied to
human learning. In spite of the early bloom and rapid spread of
programmed instruction based on the Skinnerian method, however, later
evaluations of the effectiveness of programmed instruction have not
been highly positive.



The behavioristic learning approach of Skinner and his followers 
was criticized early in its development on the grounds that, because 
their teaching practices derived from work with animals, programmed 
instructions were devoid of meaningful structure and concentrated too 
much on rote material. The Skinnerian approach thus has many critics; 
some criticisms relating specifically to programmed instruction can be 
found in Pressey (1963) and Thelen (1963a,b). 



^Of course, the Skinnerian stimulus-response approach drew its
fire from the gestalt psychologists, who insisted on a field approach
with emphasis on meaningful units rather than fragmented, serially
presented (and rote-learned) programs.

Theoretical issues aside, programmed instruction has not proved to
be the success in the classroom that it was first thought to be (Gotkin
and McSweeney, 1967; Saettler, 1968; Allen, 1971). Programmed
instruction is about as effective as conventional programs when student
achievement is used as the criterion, but its superiority has not been
affirmed. The issue of effectiveness of programmed instruction is
further clouded by the untested claims made by the manufacturers of
teaching machines (Saettler, 1968, p. 269). Few, if any, of the claims
made for the high efficiency of teaching machines have in fact proved
out. An early such claim held that by properly sequencing material in
small steps, dull students would be able to perform better, perhaps even
as well as bright students. However, in their review of this research,
Cronbach and Snow (1969) could find no evidence to support these claims.

In summary, there is no support for the claim that programmed
instruction is superior to conventional classroom methods, and this
probably explains the recent decline in research on the topic. However,
interest in programmed instruction and teaching machines has had some
positive outcomes. A book on programmed instruction edited by Lang
(1967), although it has little to say about programmed instruction as it
applies to teaching machines, discusses the design, structuring, and
sequencing of learning material for any mode of presentation and
includes problems of curriculum design. Allen (1971) points out that
research on programmed instruction has had the important effect of
producing interest in the development of individualized instruction.
Whereas early research and application focused on group instruction and
one-way communication, the current work is shifting to the unique
characteristics of the individual student as a central issue in the
design of instruction. Interest is turning, however slowly, to the study
of interactions among student, task, and material.

Experimental Work in Instruction



Organizing psychological research and making it relevant to
instruction is an enormous job and perhaps even an impossible one. The
size of the problem has been well put by Gagné and Rohwer (1969, p. 381):



Remoteness of applicability to instruction, we note with some
regret, characterizes many studies of human learning,
retention, and transfer, appearing in the most prestigious of
psychological journals. The findings of many studies of
human learning presently cannot be applied directly to
instructional design for two major reasons: (a) the conditions
under which the learning is investigated, such as withholding
knowledge of learning goals from the subject and the requiring
of repetition of responses, are often unrepresentative of
conditions under which most human learning occurs; and (b)
the tasks set for the learner (e.g., the verbatim
reproduction of verbal responses, the guessing of stimulus attributes
chosen by the experimenter, among many others) appear to
cover a range from the merely peculiar to the downright
esoteric. This is not to say that such studies do not
further an understanding of the learning process. However,
it would seem that extensive theory development centering
upon learning tasks and learning conditions will be required
before one will be able to apply such knowledge to the design
of instruction for representative human tasks.



Much of the reason for the gap between experiments on learning in 
the laboratory on one hand, and classroom applications on the other, lies 
in the influence of behaviorism and its emphasis (real or implied) on 
association learning. The behavioristic tradition in general, of course, 
has always had its critics. The psycholinguists, led by Chomsky (1959), 
have leveled strong criticisms, and the debate continues. The basis of 
behaviorism is the stimulus-response relationship and its control through 
the manipulation of reinforcement. The inadequacy of this model even in 
simple animal learning has been questioned repeatedly, and its
application to human learning (particularly verbal) is considered by
many to be grossly inadequate (Deese, 1969; Garrett and Fodor, 1968).
Nevertheless, behaviorism dominates in learning and experimental
psychology, and the methods used in studies of learning are almost
exclusively those of behavioristic inclination. Some examples of widely
used methods are summarized below.



Studies of human association learning typically present pairs of
stimuli (words, symbols, pictures) to the subject during the learning
phase, and test for his recall of the second stimulus by presenting him
with the first. A recognition measure of retention (or learning) may be
used in which the subject selects the correct stimulus out of several
presented to him. An even more primitive form (serial learning) simply
presents stimuli in lists; learning is measured by the degree of recall
(or recognition) of the list. In the study of human learning, hundreds of
laboratory studies involving serial and association learning occur each
year, but the value of studies of paired-associate learning for the
classroom has been repeatedly questioned, and it is generally concluded
that [...] et al. (1971) caution against this conclusion, because
substantial relationships have been reported between paired-associate
and school learning.

Another frequently used method for studies of human learning is
discrimination learning. In this method, the subject learns to make
a differential response to different stimuli through the application
of a reinforcer. Usually there are two stimuli and two responses. For
example, the subject may be reinforced (with a reward or with feedback
concerning the correctness of his response) for responding to one
stimulus, and not reinforced for responding to another. Learning is
measured in terms of the time or number of responses necessary for the
subject to "learn" to respond only to the "correct" (the reinforced)
stimulus. This method may make use of an irrelevant stimulus (one
present but one not necessarily attended to by the subject); the subject
is then tested for how well he "correctly" responds to this incidental
stimulus (incidental learning).
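
The measure described above, the number of responses a subject needs before he responds only to the reinforced stimulus, can be illustrated with a short sketch. It is offered only as an illustration: the criterion used here (a fixed run of consecutive correct responses), the function name `trials_to_criterion`, and the sample session are all assumptions made for the example and are not taken from any study reviewed in this report.

```python
def trials_to_criterion(correct, run_length=5):
    """Return the 1-based trial on which the subject completes
    `run_length` consecutive correct responses, or None if the
    criterion is never reached.

    `correct` is a sequence of booleans, one per trial, where True
    means the subject responded to the reinforced stimulus.
    The consecutive-run criterion is an assumed convention.
    """
    streak = 0
    for trial, is_correct in enumerate(correct, start=1):
        # A wrong response resets the run; a right one extends it.
        streak = streak + 1 if is_correct else 0
        if streak == run_length:
            return trial
    return None


# Hypothetical session: errors early, then a stable run of correct responses.
session = [False, True, False, True, True, False,
           True, True, True, True, True]
print(trials_to_criterion(session, run_length=5))  # prints 11
```

A lower value indicates faster learning under this criterion; a different criterion (for example, elapsed time) would need a different scoring rule.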



At least two excellent reviews of learning research are now
available: Anderson (1967) and Gagné and Rohwer (1969). Both reviews
organize a large variety of research around a few central issues, and
both evaluate as well as summarize the research as it relates to these
issues. In addition, we have made a brief review of major articles
published since Gagné and Rohwer. We will discuss and review those
activities and issues that appear to be most immediately relevant to
instruction in the classroom. No reference will be made to specific
studies except for those not included in Anderson and in Gagné and
Rohwer.



Transfer of Learning 

A central issue in learning theory and a critical one in classroom
learning is that of transfer or generalization of learning. A
disappointment of pre-school and compensatory education programs has
been the fading of achievement gains over time. This has led to an
interest in the question of how achievement in basic skills such as
reading and mathematics might be generalized to future achievement and
to concurrent achievement in other school subjects. However, there
appear to be no direct attempts to measure this generalization in the
classroom.

Although we lack studies in the classroom, the psychological
research on generalization (referred [...] (1962) distinguishes two
kinds of transfer. In one case, there is [...] from the learning of a
specific [...] general class of tasks. He terms this lateral transfer.
It is equivalent to generalization. In other words, generalization
operates whenever two learning problems require common rules for
solution, or depend on some common stimulus or response sequences.

A second kind of transfer -- vertical transfer -- operates when the
learning of a specific task facilitates the learning of another. For
example, training in stimulus coding -- that is, a translation of
meaningless symbols into meaningful ones via mnemonic devices --
transfers to paired-associate learning. Subjects trained in coding learn
faster. In this case, stimulus coding is a subordinate skill to
paired-associate learning; however, it is not necessary to, or a part
of, the learning task. This is the kind of transfer that Gagné and
others consider in studies of "hierarchical organization," where
learning a task lower in the hierarchy facilitates the learning of
higher-order tasks.

Lateral transfer is a less popular research topic (Gagné and Rohwer,
1969). Results of recent studies hold no surprises. Much of the
research on lateral transfer has centered on learning general rules.
Research shows that verbalizing the rule is better than not, and using
a wide variety of examples of the rule in the learning phase helps to
promote transfer.

Studies of vertical transfer carry a number of important
implications for the design of instruction. The notion of hierarchical
organization was first outlined in detail by Gagné (1962). He asserts
that knowledge of a subject can be arranged in a hierarchy such that
knowledge at any one level of complexity depends upon the attainment of
knowledge lower in the hierarchy. Theory predicts that in learning a
subject students cannot "pass" a post-test on the subject unless they
also have "passed" tests for skills lower in the hierarchy of knowledge.

A number of studies designed to test the hierarchical theory report
results supporting the theory. In a recent review, the originator of
the theory comments that:

Studies of transfer of prior learning are frequently
consistent with this hypothesis, although few are confirming
in a crucial sense (Gagné and Rohwer, 1969).
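
The prediction of the hierarchical theory, that no student passes a higher-level test after failing a test lower in the hierarchy, can be checked against pass/fail records with a short sketch such as the following. It is purely illustrative: the skill names, the student records, the function name `hierarchy_violations`, and the choice to treat a missing score as a failure are assumptions made for the example, not data or procedures from the studies cited.

```python
def hierarchy_violations(results, prerequisites):
    """List cases that contradict the hierarchical prediction.

    results:       {student: {skill: passed (bool)}}
    prerequisites: {skill: [skills directly below it in the hierarchy]}

    Returns (student, skill, failed_prerequisite) tuples for every
    student who passed `skill` without passing a skill below it.
    A skill with no recorded score is treated as failed (an assumption).
    """
    violations = []
    for student, passed in results.items():
        for skill, lower_skills in prerequisites.items():
            if not passed.get(skill, False):
                continue  # the prediction only constrains passed skills
            for lower in lower_skills:
                if not passed.get(lower, False):
                    violations.append((student, skill, lower))
    return violations


# Hypothetical three-level hierarchy: skill_a -> skill_b -> post_test.
prerequisites = {"post_test": ["skill_b"], "skill_b": ["skill_a"]}
results = {
    "s1": {"skill_a": True, "skill_b": True, "post_test": True},
    "s2": {"skill_a": True, "skill_b": False, "post_test": False},
    "s3": {"skill_a": True, "skill_b": False, "post_test": True},
}
print(hierarchy_violations(results, prerequisites))
# prints [('s3', 'post_test', 'skill_b')]
```

An empty list is consistent with the theory for the data at hand, but, as the comment quoted above suggests, consistency of this kind is weaker than a crucial confirmation.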

Ausubel (1963) has developed a theory of hierarchical organization
of meaningful verbal material. The hierarchy begins at the bottom with
detailed and specific bits of knowledge and builds to a level containing
the most abstract and general concepts. The learning of new material
can be facilitated by the use of "advance organizers," which help the
learner integrate new material into his existing cognitive structures.
These advance organizers are highly generalized statements or questions 
that the subject reads prior to studying new material. Their purpose 
is to prepare the reader for new material in terms of what he already 
knows; or the advance organizers may outline and brief the material. 

In addition to experimental support cited by Ausubel, several other 
studies also find supportive evidence for the theory (Grotelueschen and 
Sjorgren, 1968; D. I. Allen, 1970; Merrill and Stolurow, 1966; Merrill, 
Barton, and Wood, 1970).^ 

A topic closely related to transfer involves a technique that has
come to be called "fading" or "vanishing." In this technique one stimulus
is faded out and slowly replaced by a new one. Anderson (1967) reports
on research in this area that may have practical value for teaching
children who cannot understand or hear verbal instructions. In this
technique, the students are able to learn to make the correct response
to the new stimulus without trial-and-error behavior. A recent study
by Karraker and Doke (1970) found the fading technique to be superior
for the errorless learning of the discrimination of the letters b and d
by kindergarten children. However, Samuels (1970) summarizes reading
research using the fading technique and finds contradictory results.
In these studies, a picture and word are shown together, and the picture
is gradually faded out. It appears that the desired attention shift
from the picture to the word does not always take place. In view of
the contradictory evidence and the limitations of this technique, it
appears to have little utility in the classroom at this time.

Vertical transfer has been studied under a number of other theories
and experimental approaches, including rule learning, concept learning
and attainment (see discussion of the work by Piaget), verbal learning,
and problem solving. The results of the many studies on transfer clearly
indicate the importance of the sequence of tasks for instructional
effectiveness. These results appear to have more direct bearing on
classroom learning than any others we have reviewed, although much more
needs to be known.



Reinforcement and Feedback 



The concept of reinforcement is central in almost all formulations of
learning, and many learning theorists and experimentalists insist that
learning cannot occur without reinforcement. In the absence of a clearly
defined external reinforcer, these theorists assume that reinforcement is
provided by the subject and is internal.^ For example, a subject may be
reinforced with some tangible reward for reading, or he may read because he
finds it personally rewarding. The latter is considered to be a case of
intrinsic reinforcement. Other learning is said to take place as a result
of the operation of social reinforcers or broadly generalized extrinsic ones.

The importance of reinforcers for learning has been demonstrated in
the laboratory, but only under the strictest definition of
reinforcement: if a stimulus presented immediately after the occurrence
of a response leads to an increase in the response rate, it is a
reinforcer. It is frequently argued that the use of this rigorous
definition of a reinforcer in complex learning is at best unproductive.
The stimulus properties of the reinforcer are not known, nor is the
desired response clearly defined.



Psychologists have, however, tried to address complex learning.
For example, the term "feedback" has been used by some psychologists to
indicate an information-processing and volitional aspect of complex
learning. It is a general term that may be used to denote either the
reinforcing event, the subject's interest in and use of the event, or
both. Obtaining a penny (or candy) reward for the correct response in
a discrimination learning task may be thought of as providing feedback
about the correctness of response and defining how the subject can
obtain further reward. Providing knowledge of results to the learner
(feedback) is considered by many theorists to be a reinforcer for
wanting to learn, and the reinforcing event is primarily intrinsic,
although under partial control of the external event.



^If carried to an extreme, reinforcement becomes a tautological
concept -- that is, if an external reinforcer is absent then the theorist
defines some internal reinforcement to account for learning.

Although studies of various factors of reinforcement have dominated
much of the psychological study of learning, it appears to us that few
of the results have any real value in determining classroom learning.
The application of a term like feedback does not solve issues, because
it is difficult to find consistent rules of feedback. Gagné and Rohwer
(1969, p. 401) note that:

A characteristic of recent research is that it reveals
clearly the highly variable nature of feedback effects.
Moreover, the research indicates that the sources of this
variance are to be found in learner characteristics, type
of feedback, timing of feedback, direction of feedback, and
type of task.

Attention Factors in Learning

For learning to occur, the appropriate stimuli must be attended by
the learner, and factors in attention have played a central role in
learning experiments. There is a well-established body of research to
indicate that stimulus novelty promotes learning and helps to maintain
attention. In human learning, guessing and delayed feedback lead to
better learning than no guessing and immediate feedback. In general,
factors that increase the uncertainty of a stimulus complex lead to
heightened curiosity or increased attention. In reading material, it
has been found that retention is improved when questions are inserted
throughout the text. These results are generally interpreted as
indicating increased attention and inspection time.

One of the more easily manipulated factors in instruction is the
mode of presentation of the learning material. In summarizing research
on stimulus presentation, Gagné and Rohwer (1969) state:

Considerable evidence has now been amassed indicating that
when there is a choice of method for presenting equivalent
information, the following results prevail: pictorial materials
are superior to verbal; concrete verbal materials are
preferable to abstract verbal; and grammatically structured are
better than unstructured materials. In contrast, the conditions
that might dictate choices among various available
modes of presenting stimuli are almost entirely undetermined
thus far. Finally, stimulus context appears to be one of the
most potent of the variables determining the effects of
materials presented, although tasks other than traditional
laboratory ones remain to be investigated.

Research that finds pictures superior to words is mostly based on
the paired-associate method. These studies typically require the subject
to learn lists of paired words, paired pictures, or a word paired with a
picture. Although results favor the picture presentation, the relative
effectiveness of the two modes appears to depend on a number of factors,
including student characteristics and age and task characteristics.
However, Samuels (1970) finds on the basis of studies of classroom
learning that pictures have a negative effect on learning to read,
especially for the poorer students. Pictures are interpreted by Samuels
as distracting stimuli that produce attention shifts. This is consistent
with other findings about the effect of distracting stimuli on learning
by poor students. The studies reported by Samuels involved young
children learning to read, while most of the studies using the
paired-associate method used older subjects. Thus, age differences may
account for the disparate results obtained by the different methods.

Retention of Learned Material

Once material has been learned, a key question is the length of
retention. Studies of retention and forgetting are as old as the study
of learning, and one of the principal measures of learning has always
been amount of retention. Gagné and Rohwer (1969, p. 401) give an
excellent review of the research, the principal findings, and the basic
issues involved. Unfortunately, like much of the research reviewed in
this section, work on retention depends on the paired-associate method,
which makes generalization to the classroom hazardous.

Earlier studies that seemed to demonstrate better retention for
free recall than for recognition learning have since been shown to
depend on the degree of original learning rather than the method of
learning. A number of studies have confirmed that when control is
introduced for the degree of original learning, retention is
approximately the same for all methods of learning (within the
limitations of paired-associate learning). Even the degree of
meaningfulness of the material does not affect retention when the degree
of learning is controlled for. Of course, "meaningfulness" here is used
strictly in the framework of paired-associate learning, where it refers
to the use of words instead of nonsense syllables, or the use of
grammatically correct sentences instead of random orders of words. This
does not seem to be closely related to what educators generally mean
when they talk about meaningful material.

Other factors affecting retention have been isolated. The effect
of retroactive inhibition is well known. This occurs when a learning 
task inserted between the learning of an original task and the measure 
of retention causes the original material to be forgotten. It has also 
been found that elaborating on (talking about) the stimuli in the learning 
phase promotes retention. 

STUDENT CHARACTERISTICS

In this subsection, we discuss evidence showing that a general 
failure to match student characteristics with specific educational pro- 
grams is a major reason for the lack of positive findings in educational 
research and for the consequent lack of success in defining factors that 
substantially affect educational outcomes. Little attention has been paid 
in the literature to identifying pertinent student characteristics or 
to developing specific educational programs to fit individual
characteristics. A priori, it seems reasonable to believe that students
respond differently to different kinds of classroom and instructional
methods and to different types of teachers. As reasonable as this
hypothesis may sound, there is little research to support it, although
some notable exceptions are pointed out below.

Although there are undoubtedly many social reasons why individual
student differences have not been a major part of research, it is worth
noting psychological reasons. Cronbach (1957) has pointed out that
psychology is split into two disciplines. One group of psychologists
(mostly psychometricians and, to some extent, personality theorists)
have been greatly concerned with individual differences and have mostly
ignored the development of a general theory of behavior. Others (notably
learning theorists and experimental psychologists) have attempted to
develop theories of behavior while ignoring individual differences.
This split has been particularly damaging to education, because learning
theorists have little to say that bears directly on learning in the
classroom. Gagné (1967, p. 13) notes:

First, the widespread inattention to individual
differences seems to indicate that psychologists have been uniquely
optimistic in their expectations for the generality of
behavioral laws. In the pursuit of these laws, the
assessment of ranges of generalization and of limiting conditions
has been by-passed. If we recognize learning as a process
of transition from an initial state to an arbitrary terminal
state, then with respect to the individual differences
problem, we should take a lesson from other natural sciences. We
must recognize limitations in the applicability of a scientific
law. It is through the specification of limiting conditions
that our hypothesized or theoretically derived relationships
obtain concreteness.



Abilities and General Intelligence 

The study of human abilities has long been an area of psychological
research concerned with individual characteristics. Alternative theories
and the experimental literature generated by this effort have been
discussed in a number of places (Guilford, 1967; Cronbach and Snow, 1969;
Snow, 1971). The most widely accepted theories identify some kind of
general ability (general intelligence) and a number of special abilities.

The relative influence of heredity and environment on the
development of ability is a topic of continued interest and debate. Some
theories hold that abilities are genetically determined, unfolding in
the process of development. Others maintain in varying degrees that
abilities are learned and that heredity only places loosely defined
boundaries on their development. Snow (1971) comments:

The bulk of the evidence seems to be against the
unfolding hypothesis, but a learning hypothesis
remains [...]

The most recent upsurge of interest in genetic determinants of
intellectual ability was prompted by the work of Jensen, who reports
on the interaction of two broad categories of ability (Level I and II,
to use his terminology) and types of learning (associative and
conceptual). His findings and his interpretation in terms of heredity
are a matter of much controversy; more research is needed before a
conclusion can be made. In particular, the effect of "tuning"^ on
students low on tests of Level II ability must be investigated, since
there are subjects who have had little exposure to, or use for,
conceptual thinking.

In a well-designed study by Rohwer et al. (1971), several hypotheses
deriving from the Jensen model are investigated. Some results support
the model and some conflict with it. The authors present an alternative
explanation, one independent of an assumption of differences in innate
ability between populations. The study makes clear that part of the
problem in verifying Jensen's model lies in the fact that Level I and
II tasks are not readily defined.




Although the relative contributions of heredity and environment
are not known, there is confirmatory evidence of differences in general
cognitive abilities between ethnic groups. Stodolsky and Lesser (1967)
review the evidence for this conclusion and report on their own
carefully controlled study. In their study, they find highly significant
differences in patterns of achievement across four mental abilities
(verbal, reasoning, number, and space) for various ethnic groups (Chinese,
Jews, Negroes, Puerto Ricans). That is, the level of attainment in each
of the four abilities varied within an ethnic group, but ethnic groups
differed in terms of which ability they attained best. Differences were
also found for lower- and middle-class children within an ethnic group,
and while the patterns were very different for different ethnic groups,
they were nearly identical for the two classes within an ethnic group.
Thus, whatever factors produce differences in ethnic patterns of mental
performance operate in both lower and middle classes. The authors feel
that more research is necessary to determine the specific antecedents of
the differential patterns of mental ability.



Some students have little or no practice in the use of mediation
or the search for general principles in problem solving. Thus, they
do poorly in abstract or conceptual problem solving compared with children
who come from an environment that encourages the use of mediation. It is
concluded that the poor performers [...] to do conceptual thinking.

Tuning is a pretraining in which subjects [...].
Differences between groups often disappear after tuning is used.
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Some recent successful attempts to improve the IQ scores of Negro 
ghetto children argue against a genetic explanation of the children's 
generally lower scores. Through work with parents, some recent attempts 
to modify IQ in pre-school children show promise, as do some programs that 
focus on language learning (see Elkind and Sameroff, 1970, for a review 
of these studies). Two recent programs beginning with pre-school children
show promise: one at the University of Illinois (Engelmann, 1970) and
the Ypsilanti-Carnegie Project (Lambie and Weikart, 1970). The Illinois
programs, especially, demonstrated substantial gains in IQ scores and
school achievement. However, past studies have shown a decline over time
of IQ gains resulting from special programs, so that one needs to know
longitudinal effects before making a final evaluation of these programs.



The above studies are examples of success in identifying special
abilities. The important question for this report, however, is how these
abilities affect educational outcome. Studies that investigate the effect
of special abilities on learning have been summarized and evaluated in a
number of places (Ferguson, 1965; Fleishman and Bartlett, 1969). However,
Cronbach and Snow (1969) find serious methodological flaws in much of the
research and conclude that there is little clear evidence for the assumption
of an interaction between special abilities and learning. This is
not meant to imply that specific abilities do not affect educational
outcome, but that their utility for differentiating success with particular
teaching methods has not been adequately demonstrated.

Whether general intelligence (or general ability) is related to
learning is a matter of some controversy. Evidence from factor analytic
studies indicating that intelligence is not a unitary ability, and low
correlations from studies of IQ and learning and among [...]
tasks, led Fleishman and Bartlett (1969) to favor an interpretation that
does not define intelligence as the ability to learn. Cronbach and
Snow (1969) take issue with this point of view; after reviewing research
and re-analyzing some of the existing data, they find that general
intelligence is consistently and substantially correlated with learning.
Much of the confusion, according to these authors, arises from the fact
that many studies of the relationship between intelligence and learning
use laboratory tasks that do not allow general intelligence much effect.
In addition, most of the support for special abilities comes from the
factor analytic approach that dominated American research on abilities
for several decades. This approach tends to overdifferentiate, because
even slight correlations sometimes produce new factors; in the process,
a general intelligence factor tends to be submerged. British researchers
have used a hierarchical model of abilities (Vernon, 1965). The views
of Cronbach and Snow (1969) are more consistent with the British approach.

See Stearns (1971b) for a detailed review of the literature on
the effects of preschool programs in raising children's IQ.

Along with the finding that general intelligence is correlated with
degree of learning, Cronbach and Snow (1969) report evidence of significant
and substantial interactions between intelligence and instructional
method (aptitude-treatment interaction). In other words, instructional
methods and learning tasks can be found that are differentially effective
on the basis of level of student general ability. For example, under
instructional methods A and B, an interaction effect means that if high-ability
students do relatively well under treatment A, they do relatively
poorly under B. Conversely, low-ability students do relatively well
under B and poorly under A. If groups of students given methods A and
B are equally mixed in regard to ability, no difference will be found
in average performance between the two methods. This is believed to
explain much of the failure in educational research to find positive
effects due to instructional innovation. The kinds of educational treatments
that will produce an interaction with general ability are not well
understood, but some suggestive possibilities can be brought out in the
following pages.
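
The arithmetic behind the A and B example can be illustrated with a short sketch. The scores below are invented purely for illustration and are not taken from Cronbach and Snow (1969) or any other study cited here; the function names are likewise hypothetical. The sketch assumes a simple crossover pattern and shows that equally mixed groups yield identical method averages even though the interaction is large, whereas matching method to ability raises the average.

```python
# Hypothetical mean scores for a crossover (aptitude-treatment) interaction.
# Every number here is invented for illustration only.
scores = {
    ("high", "A"): 80, ("high", "B"): 60,
    ("low", "A"): 50, ("low", "B"): 70,
}


def method_mean(method, share_high=0.5):
    """Average score under one method for a group with the given share of high-ability students."""
    share_low = 1 - share_high
    return share_high * scores[("high", method)] + share_low * scores[("low", method)]


def interaction_contrast():
    """Difference of differences: (A - B) for high ability minus (A - B) for low ability."""
    high_gap = scores[("high", "A")] - scores[("high", "B")]
    low_gap = scores[("low", "A")] - scores[("low", "B")]
    return high_gap - low_gap


# Equally mixed groups: the two methods look identical on average.
mean_a = method_mean("A")
mean_b = method_mean("B")

# Matched assignment: high-ability students get A, low-ability students get B.
matched_mean = 0.5 * scores[("high", "A")] + 0.5 * scores[("low", "B")]

print(mean_a, mean_b, interaction_contrast(), matched_mean)
```

Under these assumed numbers a comparison of the two mixed groups would report no difference between methods, which is the situation the text describes; only a design that separates students by ability would reveal the interaction.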

If we grant an interaction between educational method and student
intelligence, then to maximize achievement students should receive differential
instructional treatment (at least in some topics) on the basis
of intelligence. However, classroom grouping (by intelligence or any
other ability) has a long history of failure in promoting any difference
in achievement outcome. Thelen (1967) reviews the extensive research
on grouping and summarizes the findings of the international conference
on grouping at the UNESCO Institute of Education in Hamburg in [...].
Results clearly indicate that heterogeneous groups do about as well as
homogeneous groups. The reason for this seems obvious. Grouping on
any basis, by itself, cannot be expected to produce improvement. What
is needed is differential instructional treatment of the separate groups
(Thelen, 1967, p. 188):

In other words, special grouping makes sense [...]
the teacher has a clear and accurate idea of what to do with
the special group. From this [...] difficulty
with homogeneous ability grouping is [...]
deal with the group are often wrong. Thus, we
[...] teachers who think "bright" children "ought" to be more
self-directing, more interested in the [...]
or more eager to have a continuous, heavy [...]. By
and large, however accurate these guesses may be with regard
to impressions of bright adults who [...]
adult world, the guesses are mostly not true [...]
not necessarily true -- as applied to most bright children
under usual school conditions.



Student Characteristics and Programmed Instruction

In the past decade, there has been much interest in programmed
instruction and the application of what are sometimes referred to as
principles of learning theory. The interest in programmed instruction
derived almost entirely from the psychological field of learning; as
mentioned earlier, this discipline was not oriented toward accounting
for individual differences. For that reason, most of the research on
instructional methods, especially programmed instruction, has not
focused on (or even considered) individual characteristics. Most of
the research on instructional methods has been reviewed above. Here we
will summarize the findings of studies that have attempted to investigate
response to programmed instruction as a function of student
characteristics.

Cronbach and Snow (1969) [...] study in this area that is
exceptional in its sophistication and that leads to an interesting
hypothesis in need of further [...]

[...] of his response. (See Corey, [...]
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by Burton and Goldberg (1962), are too complex to present here. The 
essential finding was an interaction between treatment (type of feedback)
and student aptitude (verbal reasoning), but the interaction
reversed according to the difficulty of the learning task. This is
particularly important, because it points out the presence of higher-order
interactions, as well as an interaction between ability and task
difficulty.

Another excellent study (according to Cronbach and Snow) indicating 
higher-order as well as simple interactions is that of Maier and Jacobs 
(1966). In this study, some classes in Spanish had programmed instruc- 
tion (PI) only, some had PI plus live instruction, and others had live 
instruction only. In addition, students were tested for general intelligence,
Spanish language ability, and attitudes about Spanish. Results
indicate that a favorable attitude toward Spanish was associated with
PI plus live instruction for high-intelligence students. Second, there
were indications that low-ability students tended to favor PI and high-ability
students tended to favor live instruction. Perhaps the most
significant finding was that some teachers got better results under one
set of techniques and student characteristics than under others. It
appears that high-IQ students do better under PI plus teacher when the
teacher favors the innovative method. We shall return later to this
topic of student-teacher-method interaction.

Although far from conclusive, there is some slight evidence to
support the notion that low-aptitude (general intelligence) students
may respond differently to some programmed features than do high-aptitude
students. Well-structured programs may be more effective for duller
individuals, and perhaps brighter [...]
ones to a scrambled presentation. In general, however, support for
an interaction between programming and student aptitude is
meager.






The issue of meaningful versus rote learning has a [...]
introductory psychology texts usually say [...]
more easily learned. Rote learning is generally considered to require
less ability, and one is led to expect an interaction between meaningfulness
and ability.

Research on meaningfulness of instruction and its interaction with
student aptitude is surveyed by Cronbach and Snow (1969). Some evidence
of an interaction is noted, but it is not clear what factors actually
allow one type of student to gain more from meaningful instruction than
others. Tuning is seldom used, so that students who have little or no
experience with meaningful material are not on a par with students who
have. Cronbach and Snow (1969) comment on a large-scale, well-designed
study by Brownell and Moser (1949) that investigated meaningful versus
mechanical instruction in subtraction:

In half the schools, subtraction was rationalized for the
children; a major effort was made to explain why certain steps
were performed in (e.g.) borrowing. But third graders in some
of the schools seemed unable to profit from these explanations.
The authors tell us that where instruction had been rote in the
two preceding grades the whole concept of explanation in arithmetic
was strange to these pupils, and they could not incorporate
the meanings offered. The children, then, had developed
a positive inaptitude for meaningful instruction, whereas
other children had been led to the point where they could
profit from explanation. Now this is important first in undermining
the concept that aptitude or readiness is simply a matter
of intellectual maturity. Secondly, it sharply challenges such
a concept as Jensen's regarding a native incapacity. Third,
it destroys any lingering attempt to define "one best way"
of instruction. Fourth, it urges us in the direction of
trying to help the pupil who does not use meaningful instruction
effectively by combining techniques that will move his
skills forward without relying on comprehension, with techniques
that will advance his ability to comprehend. We are
in no position to write off these third graders as
non-comprehenders -- but we do not anticipate that simply tuning
will bring them to the level of mathematical reasoning.

A series of articles on the use of advance organizers in the
learning of meaningful verbal materials (reviewed above) culminated in
a study by D. I. Allen (1970) which reports evidence of aptitude-treatment
interaction. As noted earlier, advance organizers are highly
generalized statements read prior to the learning of new material for
the purpose of facilitating learning by allowing the student to relate
the new material to his existing cognitive structure. Results indicate
that the advance organizers facilitate learning (measured by delayed
retention) in higher-ability students but not in lower-ability ones.
This may indicate that students of lower ability do not have the cognitive
structure necessary to make use of the advance organizers. This
study raises a number of interesting questions that need further
exploration.



Concept Attainment 



One of the areas of major interest to psychologists, particularly
in the field of child development, is that of concept attainment and
cognitive development. In this area, one is interested in defining the
sequence of concepts as they are attained in the development of the
individual or in relating concepts to each developmental stage. There
are several very different theoretical explanations and experimental
approaches to the study of concepts. These are presented in capsule
form by Gagné (1968). Learning theorists who belong to the associationistic
school consider concept attainment to be mostly a matter of learning.
Some schools of thought conceive of concept attainment as depending on
maturation and biological readiness.



The most popular theory at this time is that of Piaget, who focuses
on the existing cognitive structure of the organism in terms of its
adaptation to its environment. Changes in adaptation are related to
modifications in the cognitive structure. A model proposed by Gagné (1968)
is based on the cumulative effects of learning (of which association is a
small part) within limitations imposed by maturation. These models and
others differ markedly in terms of the importance assigned to learning.



Theories of concept development have direct relevance to education, 
for they define the factors upon which levels of learning depend. If 
concept attainment is largely a matter of maturation and readiness, or
level of cognitive structure, then the student would not be exposed to
a task for which he has not developed adequate concepts. However, if
concept attainment depends upon prior cumulative learning, then instruction
must draw only upon the prior learning and must sequence tasks
in a hierarchy according to their contribution to other learning tasks.
Gagné's theory of hierarchical organization is in serious competition
with the ideas of Piaget, although confirmatory evidence is still
mostly lacking.

Regardless of which theory of concept attainment proves most fruit- 
ful, it is evident that differences do exist at a given time among 
students, and over time for a given student. These results have wide 
implications for the design of instruction and the time at which a student
is exposed to specific instruction. 



Personality Differences 



No field within psychology is more concerned with individual differences
than the study of personality. There is also no other discipline
in which controversy is so great, empirical findings less definite, or theory
more prolific. Reviews of this very complex area, which are found each
year in the Annual Review of Psychology, come with several perspectives,
including the behavioristic approach (Sarason and Smith, 1971), the psychometric
(Wiggins, 1968), the clinical (Klein, Barr, and Wolitzky, 1967), and
others. Yet there is little that one can apply directly to education at
this time, and methods for assessing personality traits are far from perfected,
as noted by Sarason and Smith (1971, p. 397):



The pitfalls involved in attempting to assess significant
personality attributes are many and [...]
"true score" of an individual's standing on a given dimension
is as elusive as the Holy Grail.




In spite of these pessimistic comments, there are some general results
from personality studies with implications in some indirect way for
education.

There is a growing conviction and body of supporting evidence that
personality differences exist between [...]
achiever. [...] p. 534) summarize:

High achievers show strong internalization of values [...]
indicated by responsibility and socialization. [...]
have high achievement motivation, in regard to both independent
and conforming spheres. They are, however, low on
social desirability (need to make a good impression for its
own sake) and lack flexibility, apparently preferring order
and stability. The negative loading for flexibility appears
in an equation developed on the Italian sample as well, as
will be important when we come to consider what these studies
reveal about the nature of the criterion itself. As Gough
and Fink (1964, p. 380) point out, the pattern of the achiever
"is not a pattern of creativity or innovation, but rather that
of constructive adaptation to a world in which one's circumstances
are modest and one's destiny limited."

Cronbach and Snow (1969) discuss a study that shows an interaction
between degree of meaningfulness of instruction and "overachievers"
versus "underachievers." The overachievers showed better performance
on the less meaningful material, and vice versa for the "underachievers."

The concept of anxiety is one of the cornerstones of personality
theory, and has also become a major factor in studies of learning.
Adelson (1969, p. 231) began a review of the topic with this statement:

Anxiety was the most popular single topic in per- 
sonality this year. 

and later (p. 233): 

After all these years, and after literally hundreds of
studies of anxiety, there is still no general agreement as
to what the commonly used scales are in fact measuring,
whether it is drive level, maladjustment, effect, degree of
defensiveness, or several of these in some interaction.

In the latest review, Sarason and Smith (1971) quote suggestions 
that much of the confusion results from a failure to distinguish between 
anxiety as a stable personality trait and anxiety as a temporary emo- 
tional state. 

In spite of the confusion and ambiguity of the entire area of 
anxiety research, a few suggestions are promising. Across many studies
there are indications of an interaction between anxiety and intelligence
on cognitive performance; anxiety appears to enhance the performance of
low-ability students and decrease the performance of high-ability ones.
Cronbach and Snow (1969) report an apparent interaction between personality
and instruction. It appears that structured instruction was better for
high-anxious, high-compulsive children. For the child who was neither
anxious nor compulsive, both structured and unstructured methods were
about the same. Cronbach and Snow point out that flaws in the design
of the experiment make it dangerous to generalize. It is possible that
in some schools and for some students the unstructured method would
achieve better results.

Student attitude and motivation are undoubtedly major determinants 
of achievement level. In applied research, much of the work along these 
lines has attempted to change the student's attitude about education or 
to increase his motivation. Another line of research, mostly in the
laboratory, has attempted to measure attitude and motivation and to
relate them to outcome. Some studies have investigated the relation of
motivation level to teaching technique and classroom structure. 

A particular aspect of motivation that has received much attention 
is achievement motivation, referred to as need-achievement. It appears
that achievement motivation is a particularly persistent personality
characteristic (Ryder, 1967) and one that is more related to cognitive
maturation and innate ability than to early experience or child rearing
practices (Heckhausen, 1967). Other findings (reviewed in Hartup and
Yonas, 1971; Flavell and Hill, 1969; Dahlstrom, 1970) indicate that
achievement motivation has different antecedents in young children than
in adolescents. Adolescent and later achievement motivation seems to be
related to parental and social rewards and punishments, whereas at a
younger age it seems to be related to an assertion of autonomy.

Cronbach and Snow (1969) review the literature on motivation that 
is related to the theme of aptitude-treatment interaction. Theory predicts
interaction between need-achievement and educational treatment, but
attempts to demonstrate it experimentally have not been successful.
Interactions are sometimes reported, but they are small. The tasks
used in most studies make it difficult to extrapolate to classroom
learning. In addition, many of the studies are made with college students,
and as pointed out above there are indications of differential
antecedents depending on age.

The increased national interest in academic achievement (particularly
reading and mathematics in the early grades) has caused a certain
amount of alarm concerning possible neglect of other factors in student
growth. The focus on achievement and the implementation of accountability
systems to monitor and enhance certain cognitive skills introduce
the risk of stifling noncognitive growth. Emphasis on rote learning
(and it is generally agreed that most compensatory and achievement-oriented
programs emphasize rote learning) occurs at the expense of
creative development. It is a popular lament among individuals who are
identified as creative that formal education is in many respects a
liability to creativity. Although these self-reports may not be particularly
reliable, they should not be ignored. Research on creativity
tends to support the notions of such self-reports, although studies of
creativity are not highly definitive. In reviewing the research on
creativity, Klein, Barr, and Wolitzky (1967) note:

Psychologists use widely differing criteria in studies 
purporting to deal with creativity, ranging from the careers 
of eminent people (which are obviously worthy of consideration),
to the idea of creativity in interpersonal relations
(which makes one wonder whether this is really "creativity"),
down to measures of sales productivity and customer service
(which can cheerfully be ignored). Furthermore, even when
outstanding achievement is the criterion, it usually does not
include what most informed nonpsychologists consider to be
creativity, that is, humanistic and artistic creativity.

Reporting on a study of creativity in children, Hartup and Yonas (1971) 
suggest that: 

...[there is] no clear support for the use of either test or
gamelike contexts in assessing creativity. Scores depend on
the task, the measure of creativity, the anxiety level of the 
subject, and sex. 

In summarizing recent studies of creativity, Dahlstrom (1970) states:

At the present time, therefore, available evidence
suggests that the creativity process involves a variety of
enhancing variables: interest, involvement, sensitivity, and
self-confidence; and a variety of inhibiting variables: fears,
self-doubts, and disabling sets and misperceptions acting
jointly to determine the degree of expression of whatever
the level of skill and proficiency of the individual for
that situational demand will permit.

Dellas and Gaier (1970) provide an extensive review and penetrating
analysis of the problems, issues, and results in studies aimed at
identifying creativity. Research on creativity is marked by a glaring
deficit of replicative and follow-up studies; but in spite of these
deficiencies, the authors are able to conclude (p. 67):



Despite differences in age, cultural background, area of
operation or eminence, a particular constellation of psychological
traits emerges consistently in the creative individual,
and forms a recognizable schema of the creative personality.
This schema indicates that creative persons are distinguished
more by interests, attitudes, and drives than by intellectual
abilities. Whether these characteristics are consequences or
determinants of creativity or whether some are peripheral and
of no value is moot. These questions remain insufficiently
approached and elucidated.



It is evident that no one is in a position to write a formula that
defines creativity. However, it is equally apparent that, in spite of
many problems with the research, much is known about the characteristics
of creativity. The creative person appears, among other things, to be
independent in attitudes and social behavior and not much concerned
about his impression on others. An educational program mainly interested
in behavioral conformity and standardized achievement has little of
positive value to offer the creative person. Accountability systems
that at present focus only on achievement in rote learning may well
have the effect of further alienating the creative student, especially
in the early school years.



Early Development and Learning

Psychologists, especially psychoanalysts, have for a long time
stressed the importance of the very early years in the development of
patterns of behavior that are particularly persistent. By the time a
child reaches school, these patterns cannot be modified by the school.
The time to affect cognitive and noncognitive factors in development
is during the pre-school years. Kagen (1970, p. 9) writes:



The idea of this suggestion rests on the notion that
a child's experience with his adult caretaker during the first
24 months of life are major determinants of the quality of
life motivation, expectancy of success, and cognitive
abilities during the school years.



He then reviews data that support this suggestion. 




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Support for the importance of early development comes from a wide 
variety of research reviewed every year in the Annual Review of Psychology
under the heading of Developmental Psychology. Other support comes
from the recent and growing interest in "critical periods" of development
during infancy which determine life patterns. Most of this research
has been conducted with animals, although there is supporting evidence
from research and observations on humans.

The importance of early experience for education is the topic of
a book edited by Denenberg (1970), which is somewhat slanted toward the
growing interest at the federal level in day-care centers, and the conviction
that any really meaningful change in the educability of the
culturally deprived will come through modifying and directing very early
development of motivation, learning sets, attitudes, and values.

The Ypsilanti-Carnegie Infant Education Project is one attempt to
modify the educability of culturally deprived children by working with
the mother and child. At the last report (Lambie and Weikart, 1970),
the project had been in operation for only one year, but interim results
show the program to be effective. The authors state (p. 403):

Perhaps the most important observation is that the process
of a teacher, a mother, and an infant getting ready to learn
together is even more critical than what is actually done.
To be sure, the teacher must have ideas and "expertise" to
assist the mother and infant in learning, but that is a
long way from simply providing a family with a series of
exercises.

There is little doubt that major determinants of learning style and
ability are fixed in the early life of the individual and that environment
plays a dominant role. A thoughtful discussion of the effects of
environmental deprivation on learning is provided by Mason (1970). Perhaps
the most dramatic demonstration in the literature is Skeels' study
of the effect of maternal care on institutionalized children (1966). Many
people concerned with education express the belief that, if successful,
preschool education and training will allow for the development of students
with better dispositions and abilities for learning. Many of the characteristics
of students that appear as given at school age -- such as learning
set and style, motivation, attitude, and concept attainment -- may be open
to modification in preschool years.






However, it has been pointed out in an extensive review of the
literature (Stearns, 1971b) that organized preschool interventions
through day care, Head Start, and other programs aimed at children
between ages two and six have shown quite ambiguous and contradictory
results. It is not possible at this stage to offer convincing evidence
that early childhood interventions are more likely to improve educational
effectiveness, by standard measures, than are the regular
school programs, beginning at age five or six.
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V. THE ORGANIZATIONAL APPROACH 



As noted in Section I, the basic point of view of the organizational
approach is quite different from that of the input-output and process
approaches. By this approach, better educational outcomes for individuals
are supposed to result from improving the functioning of the organizations 
that deliver the education. The school is seen as having to adapt to the 
needs of a changing set of students and to a changing set of pressures 
from the outside. Consequently, focus on the output side is on deter- 
minants of innovativeness and responsiveness, and focus on the input side 
is on rules, incentives, procedures, leverage, and so forth. 

Although there is a very large body of literature in educational 
administration and organization, it is rare to find a work that defines 
outcomes in a way that permits comparisons. The studies are not often
quantitative and rarely address the same issues. The primary mode of
analysis has been the case study, and tests of internal validity are
practically nonexistent. We have as yet found no review articles that
try to put the findings together. In a sense, the present section is
our own attempt to do this job.

After a general survey of the work on educational organizations, 
we used the following criteria to settle on eight studies (books) for 
review here: 

(1) The studies were done with an intent to compare and generalize --
to draw "lessons" -- rather than to make pure descriptions.

(2) There was some attempt to discern differences in outcomes — 
however defined — as a function of organizational rules, 
incentives, or behavior. 

(3) The studies concerned important policy issues. 

The eight studies selected encompass within-system studies and
cross-system studies. Only four are quantitatively oriented (Anderson;
Crain; Gross and Herriott; James, Kelly, and Garms) and really seek to
test hypotheses in a rigorous way. Table 1 indicates where an explicit
identification by city can be made of the school systems studied. Where
the studies address approximately the same issue and enough data are
reported, study findings can be directly compared.

Anderson (1968); Crain (1968); Gittell and Hollander (1967); Gross
and Herriott (1965); Havighurst (1964); Leggett (1969); Rogers (1969);
James, Kelly, and Garms (1966).

The following statements constitute our effort to extract meaningful 
propositions from the studies. Although they are based on an examination
of all eight studies as well as the literature on educational organization
and administration, we use quotations that express the ideas most clearly: 

Statement 1: There is a positive correlation between size
             of system and degree of centralization.

Statement 2: Large educational bureaucracies and large numbers
             of rules decrease innovation and adaptation.



It has been known for many years that extreme school district
size has a deleterious effect on the adequacy of the educational
programs and on returns for money spent. The complexities of
giant operations appear to be such that staff communication,
public expectancy, and unit variability are seriously hampered.^

In analyzing the six systems listed for this study, Gittell and
Hollander (1967) find:

The results of the study support Austin Swanson's [conclusion]
that "large systems appear to have an absolute rigidity that
defies the forces which are so important in shaping the
operations of small systems." How paradoxical it is that
those very school systems which face far-reaching changes in
their communities and clientele are least adaptive and, in
fact, resistant to meaningful innovation. Outputs of the
six cities were almost non-existent in terms of tangible



^The volume of studies in this approach is so large, and the criteria
for internal consistency so unclear, that we limited the number of
studies we covered to a representative sample of well-known work.

^Gittell and Hollander (1967, p. 1). Note the lack of emphasis on
student achievement. The authors go on to say that achievement tests have
little usefulness in comparing [...]: "the [...] influence of
socio-economic factors limits the use [...] in comparing and evaluating
fiscal and [...] operations.... We determined to try an alternative [...]
measure output at the margin, in terms of the innovation in a school
district" (p. 2).

Table 1

GEOGRAPHICAL COVERAGE OF SCHOOL SYSTEMS STUDIED 




Anderson 


Crain 


Gittell and
Hollander,
Gross and
Herriott


Havighurst


Leggett 


James,
Kelly,
and
Garms


Atlanta 


X 










Baltimore 


X 


X 






X 


Baton Rouge


X 










Bay City^a


X 










Boston 










X 


Buffalo 


X 








X 


Chicago 




X 




X 


X 


Cleveland 










X 


Columbus, Ga. 


X 










Detroit 




X 


X 




X 


Houston 












Jacksonville, Fla. 


X 










Lawndale^a


X 










Miami 


X 










Milwaukee










X 


Montgomery 


X 




* 






Newark 


X 










New Orleans 


X 










New York 




X 








Philadelphia 




X








Pittsburgh 


X 


X 




X 


X 


St. Louis 


X 


X 






X 


San Francisco 










X 


Washington 








X 




Unidentified x

effective innovation with widespread and relevant impact
on the system.^

Rogers presents similar results: 

Historically the system [New York] has become progressively
more centralized, with central headquarters officials
responsible for decisions on even the most trivial matters,
from providing light bulbs, door knobs, and erasers, to
deciding on transportation facilities ... and the trend toward
increased centralization, which complicates administrative and
pedagogical problems even in white middle class areas, makes it
much harder to run the schools in ghetto communities. It is
in such areas as Harlem, Brownsville, and Bedford Stuyvesant
that the pathologies of the centralized board have become
most obvious (p. 212).

Evidence for Statement 1 can also be found by putting together
the empirical findings on budget processes of James, Kelly, and Garms:

The budget process becomes even more centralized...a substantial
part of the control of the budget process passes into the hands
of the bureaucracy itself, simply because of the size and
complexity of the systems operations.^

But the incentives for change are weak:

The basic structure of the budget decision in big city
school systems is to assume that existing programs will
continue and to focus budget analysis upon proposed changes
in or additions to the existing programs (James, Kelly,
and Garms, 1966, p. 91).

In his cross-sectional study of schools within a large system,
Anderson finds:

In general resistance to innovation increases significantly
in large schools....^ As size increases so does the
impersonal treatment of students and in general the
resistance to innovation.^



^Gittell and Hollander (1967, p. 7) (our emphasis). From their
Table 6.1.

^James, Kelly, and Garms (1966, p. 76; see also p. 93).

^Anderson (1968, p. 146). His word "significantly" means statistically
significantly, using a chi-square test on two-way contingency tables.

^Anderson (1968, p. 157). Note that the criteria concern not
achievement but treatment of students and resistance to innovation.

Statement 3: Rigidities in a school system can be partly
             overcome by an appropriate choice of teachers.

The present study also demonstrates that attempts to personalize
instruction as well as interest in new teaching techniques and
curricula decrease as the teacher gains experience in the
schools.... The impersonal treatment of students and rigid
adherence to traditional instructional practices which are
characteristic of experienced teachers generally, and of many
teachers in middle-class schools, may thus offset not only the
value of teaching experience but the educational advantages of
homogeneous schools (Anderson, 1968, p. 163).

Statement 4: Rigidities in a school system can be partly
             overcome by an appropriate choice of principals.



In every type of school certain qualities in the principal
appear to be essential to making the school operate effectively.
In the inner-city and common-man types, the principal seems to
make almost the whole difference between a school that holds
teachers and gets a fair amount of teaching done on the one
hand, and a school where teachers and pupils are demoralized on
the other hand.^




If there is no basis in fact for the widely held assumption
that administrators who provide a high degree of professional
leadership will have schools that are more "productive" and
staffs that enjoy higher morale, it would be a telling argument
for abandoning the conception of the principal as one who plays
a leadership role. But if there is empirical support for this
common assumption, then to confine the principal to routine
administrative tasks would be to eliminate a force conducive to
improved teaching and learning. The positive relationship
between EPL [a quantitative measure of executive professional
leadership] and the teachers' morale, their professional
performance, and the pupils' learning justifies the staff
influence conception of the principalship and strategies to
increase the principal's professional leadership. The findings,
in short, offer empirical support for a leadership conception of
the principal's role, and they undermine a major argument for
abandoning it.^



^Havighurst (1964, p. 173). Havighurst goes on to note that
successful schools have principals who are willing to make independent
decisions about their own schools. But see Statement 1 on trends toward
centralization.

^Gross and Herriott (1965, p. 151). This is the only study of our eight
that seeks to connect organizational issues directly to student
achievement. However, the student achievement measures are based on
teacher-observer ratings, not standardized tests.

Corollary to Statement 4: A principal's effectiveness in
             carrying out change is positively related to
             the amount of support from higher administrative
             levels.



A timid and unenterprising principal was described as follows:

"He operates everything by the book, without realizing that
you have to adapt the book to the situation. He's afraid to
operate on his own because he's afraid of how it will look
downtown if someone questions him" (Havighurst, 1964, p. 175).



The stronger the higher administration's approval of a
principal's introducing educational change, the greater
his EPL (Gross and Herriott, 1965, p. 118).

Statement 5: Innovation in a school system depends upon
             exogenous shocks to the system.



In reviewing the data, however, it is clear that federal aid
has in its short history influenced innovation in all of the
cities...for political as well as economic reasons, federal
funding has pushed school people to innovation (Gittell and
Hollander, 1967, p. 22).

Federal funding for the introduction of nonprofessionals and
for the expansion of existing programs is clearly of prime
importance (Leggett, 1969, p. 181).

It is evident from the survey of community participation that
the six city school systems display different degrees of
openness and receptivity to such groups. In Detroit the
system appears to encourage outside participation and
involvement which is not necessarily supportive of the
establishment.... It is not surprising therefore that Detroit
proved to be the most innovative of the school systems studied
(Gittell and Hollander, 1967, p. 116).

What confidence can we have in these statements? We define
"confidence," as before, in terms of the condition that the studies
meet criteria of internal validity and external validity. So far as
case studies or comparative case studies go, there are no formal
criteria for distinguishing good ones from bad ones. Presumably, a
good case study should be "rich"; that is, it should provide an
extensive description of behavior so that the reader is persuaded that
this is "the way it really is." Presumably too, a good case study should
present hypotheses for testing in other contexts, since the hypotheses
presented in a given study are drawn from the case. The presentation
of hypotheses will have to be conditional, since the analyst will not
be able to "control" important variables. In fact, since the case is
approached without a particular model in mind, it is an open question
which variables to study. This is an important choice not often faced
explicitly, for the resources available for case work are limited.

The problem of external validity is as vexing as that of internal
validity. There is a fundamental dilemma that has not been resolved.
It is hard to generalize from small samples, but large samples are
costly. Furthermore, as the number of sample points increases, one
is forced to aggregate, to trade the rich descriptions for variables
that vary across all the samples.

Measurement is a major problem that cuts across both types of
validity. Defining and measuring innovativeness, flexibility, or
responsiveness often rests on the subjective assessment of the
analyst. Other observers may not agree or may define innovativeness
in different ways.




VI. EVALUATION OF BROAD EDUCATIONAL INTERVENTIONS 



In Sections III and IV we discussed the effectiveness of well-
specified educational treatments and experiments that were for the most
part designed to measure the impact of specific program characteristics
such as teaching strategies, curricula, and so on. In this section
we attempt to analyze the effectiveness of broader educational
interventions much more directly related to large issues of social
policy. These are programs in which treatments are devoted to "groups of
children as a whole in diverse programs taken as a whole" (Stearns,
1971a, p. 6). Although there is no reason why such interventions need to
be limited to specific types of children, almost all broad intervention
programs have been directed toward overcoming the effects of the
environment of poverty. The most obvious examples are Head Start and
Title I of the Elementary and Secondary Education Act (ESEA) of 1965.
There have been a number of smaller, more experimental studies of broad
educational intervention programs as well.

In such interventions the resources devoted to each child are
normally increased substantially. This can take the form of smaller
class sizes, additional instructional personnel (often specialists or
paraprofessionals), more individualized instruction, or more intensive
use of audio-visual equipment. Usually, the emphasis has been on
achieving program goals and not upon the needs of careful research and
evaluation. Because of this, research designs are often much less
precise than those discussed in the earlier sections concerning the
process approach. Since any number of educational inputs are changed at
the same time, it is difficult to tell precisely which program features
are responsible even when there is demonstrated success. Control group
perfection has naturally been sacrificed to the more pragmatic goal of
educating the children who need it most. The research materials available
concerning such broad programs of educational intervention are therefore
considerably inferior to those used in the process discussion above.
Consequently, these evaluations, discussed in the first two subsections,
are subject to a number of analytical problems. Interventions designed
basically for research are treated in the third subsection. These studies
are much more analytically solid, but they suffer sample size problems
in that few individuals are included in the typical study. The fourth
subsection deals with attempts to identify the components of
"successful" interventions. We conclude with a discussion of the costs
of compensatory education.

FINDINGS FROM LARGE-SCALE EVALUATIONS 

Several large surveys have been made of federally funded national
intervention programs and one large project conducted by the New York
City School System. These programs are so well-known that they
undoubtedly do not require extensive description. The largest program,
funded at more than one billion dollars annually, is ESEA Title I.
Congress did not stipulate how the funds were to be spent beyond stating
that they were to be used for compensatory education of children from
culturally disadvantaged environments and that projects must be approved
by an appropriate state education agency. Most Title I projects have
been concerned with the techniques used, and most Title I instruction
has been in the elementary grades; a few high school and preschool
programs have been conducted.

The other national program, Head Start, is completely a preschool
compensatory education program. It has emphasized general child
development and not the teaching of skills per se. Most Head Start
programs have been "permissive-enrichment" programs, characterized by
their "whole-child orientation, their strategy of watching and waiting,
and the resultant low degree of structure" (Bissell, 1970, p. 13). The
Head Start program is also large; an average of more than twelve thousand
centers annually have been in operation over the past five years.

Since early in 1968 there has been a second phase of the Head
Start program, termed "Follow Through." In this program Head Start
children are given additional instruction in kindergarten and first
grade.

The New York City Schools' Higher Horizons program was the first
major effort toward compensatory education, beginning in the 1959
school year. Each of 52 elementary and 13 high schools was assigned
an additional allotment of teachers who helped to train other teachers,
improve reading and arithmetic, or perform other tasks at the discretion
of the building principal. The funding level was about $60 per pupil.
During the first three years the program structure was left primarily
to the option of individual principals, after which management was
centralized, with most schools having an allotment of three teachers,
two for academic improvement and one for cultural enrichment.

The findings from numerous surveys of these programs (a majority
of which are for Title I) are that, with the possible exception of
the Follow Through program, there is very little convincing evidence
from existing measures leading one to believe that the resources invested
have made much difference in the progress of children from disadvantaged
environments.



ESEA Title I

The most pessimistic findings come from the Title I surveys. We
have carefully examined the reports commissioned by the U.S. Office of
Education for the last three fiscal years (including a draft report by
Gordon for 1971), and in addition have read several papers written by
independent scholars. We do not attempt to summarize the results of
each of these studies separately because they are all quite consistent
in their findings. The following quotations are representative:

An analysis of the reading achievement scores of 155,000
participants of 189 Title I projects during the school
year ending in June 1967 indicates that a child who
participated in a Title I project had only a 19% chance of a
significant achievement gain, a 13% chance of a significant
achievement loss, and a 68% chance of no change at all.
This sample of observations is unrepresentative of Title I
projects. It is, more likely, representative of projects
in which there was a higher than average investment in
resources. Therefore, more significant achievement gains
should be found here than in the more representative sample
of Title I projects. (Piccariello, Report for Fiscal Year
1967, no date, p. 1.)

For participating and non-participating pupils, the rate
of progress in reading skills kept pace with their historical
rate of progress.... Compensatory reading programs did
not seem to overcome the reading deficiencies that stem from
poverty. (U.S. Office of Education, Report for the Fiscal
Year 1968, 1970, pp. 126 and 127.)

It will be noted in the following reports of analyses that
all outcome data indicated a distinctly higher than average
reading gain for non-participants than for participants.
(Glass et al., Report for Fiscal Year 1969, 1970, p. 6.3.)

Participants in the compensatory programs continued to show
declines in average yearly achievement in comparison to
non-participants who included advantaged and non-disadvantaged
pupils.... It was not possible from these data to
determine whether participants in compensatory programs
showed a reduced decline in average yearly achievement.
(Gordon, Report for Fiscal Year 1970, 1971, p. 23.)

These findings are all qualified heavily in subsequent discussion by
the study authors, who cite problems we discuss below. Nevertheless,
the fact remains that, qualified or not, all the findings themselves
are consistently negative.

Head Start 

There have been two national surveys of Head Start. The first
was an inquiry early in the program (Wolff and Stein, 1967). It showed
some positive effects of the Head Start program which, however,
disappeared in the first grade. The report indicated that Head Start
children who went on to kindergarten or first grades composed mostly
of other Head Start children did better than those who had fewer Head
Start children in their classes.

The other survey of Head Start is much better known. It is the
study commissioned by the Office of Economic Opportunity in 1968 and
known as the Westinghouse/Ohio University Report (Cicirelli et al.,
1969). Since the Westinghouse Report was more recent and had a much
more comprehensive research design, we will discuss its findings in
somewhat more detail. They are slightly more optimistic than those
just quoted for Title I, although the overall prognosis is still rather
bleak.

The Westinghouse project picked some 104 Head Start centers (out
of more than 1,200 centers throughout the nation) at random, and all
children eligible to enter each center were identified. From these
children were chosen groups of eight who attended Head Start and a
carefully matched comparison group of children who did not. The
children in both groups were extensively tested during the 1968-69
school year. Since the program began in the 1965 school year, it was
possible to compare program and non-program children in grades one, two,
and three.

The study found that there were small but significant differences
in favor of full-year (but not summer) Head Start children at the
beginning of grade one on the Metropolitan Readiness Test (a generalized
measure of learning readiness). But Head Start children at the
beginning of grade two from either summer or full-year programs did not
score significantly higher than the controls on the Stanford Achievement
Test. There were no differences found in children's self-concept or
teacher ratings of classroom behavior between the Head Start children
and the control children. When children from full-year programs were
stratified by region and race it was found that the centers in the
Southeastern region, in poorer cities, and of mainly Negro composition
were more successful than the others.^

Follow Through 

The Follow Through program is a program for disadvantaged children
from kindergarten through third grade who had previously been enrolled in
Head Start or similar programs. Programs were developed by a group of
sponsors who had been active in compensatory education. Although there
is some built-in variation between sponsors, all programs were intended
to develop the academic abilities of the children through such practices
as reduced class size, small group and individualized instruction, use of
teacher aides and classroom volunteers, and so on. All programs also
sought to increase the self-esteem and motivation of the project children.

^The Westinghouse study was controversial and has been widely
criticized on methodological grounds. However, in a detailed and balanced
review of the controversy Stearns (1971b, pp. 117-134) concludes, "Head
Start has been only 'marginally effective' on the average."

The Follow Through program has been evaluated by the Stanford
Research Institute (1971). Four groups of Follow Through and non-Follow
Through children were compared where degree of poverty was stratified.
One group was at the kindergarten level, two in first grade (one had
been to kindergarten and one had not), and one in grade two. All four
Follow Through groups gained more than their counterparts not in Follow
Through, although in only two (kindergarten, and first grade with no
kindergarten) were the differences significant. The Follow Through
children entering in grade one began the year substantially behind their
counterparts, which means that the additional gain involved may have been
in part due to the "regression to the mean" phenomenon.^ There is no
indication in the report that an adjustment was made for this.
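The regression effect can be illustrated with a small simulation. The sketch below is not taken from the Stanford Research Institute report; the score scale, the size of the test error, and the function names are assumptions chosen only for illustration. Every simulated pupil has the same true gain, yet test error alone yields a negative correlation between initial scores and measured gains, so a group that starts behind appears to gain more.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_gain_correlation(n_pupils=5000, error_sd=5.0, seed=0):
    """Return (correlation of initial score with measured gain,
    mean gain of the lowest-scoring quarter,
    mean gain of the highest-scoring quarter)."""
    rng = random.Random(seed)
    true_gain = 5.0  # assumption: identical true gain for every pupil
    pupils = []
    for _ in range(n_pupils):
        ability = rng.gauss(50.0, 10.0)  # assumed true-score scale
        pre = ability + rng.gauss(0.0, error_sd)   # initial test with error
        post = ability + true_gain + rng.gauss(0.0, error_sd)
        pupils.append((pre, post - pre))
    pupils.sort()  # order pupils by observed initial score
    quarter = n_pupils // 4
    low_gain = statistics.fmean([g for _, g in pupils[:quarter]])
    high_gain = statistics.fmean([g for _, g in pupils[-quarter:]])
    r = pearson([p for p, _ in pupils], [g for _, g in pupils])
    return r, low_gain, high_gain


if __name__ == "__main__":
    r, low_gain, high_gain = simulate_gain_correlation()
    print(f"correlation(initial, gain) = {r:.2f}")
    print(f"mean gain, lowest quarter  = {low_gain:.2f}")
    print(f"mean gain, highest quarter = {high_gain:.2f}")
```

Under these assumptions the lowest-scoring quarter shows a larger measured gain than the highest-scoring quarter even though no pupil truly gained more than any other, which is why an unadjusted comparison of groups with different starting points can mislead.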

Higher Horizons 

Finally, there is some slight evidence of favorable results in the
Higher Horizons program. There were favorable outcomes for program
children on one sixth grade IQ test, on a sixth grade arithmetic test,
and for grade six reading for below norm pupils. A majority of the
Higher Horizons findings were "no difference," however (Wrightstone
et al., 1964).

Evaluation of the Evaluations 

If we were to base our total assessment of the value of
compensatory education programs upon the findings of the surveys just
discussed, it would undoubtedly be best summarized by the first line in
Jensen's now famous paper on the heritability of native ability:
"Compensatory education has been tried and it has apparently failed"
(Jensen, 1969). But before so concluding, we should have first been
assured that the survey evaluations used in arriving at such a verdict
were themselves an accurate description of the real world, and no such
assurance is possible, even with a considerable stretch of the
imagination. Some of the most important reasons follow.

^Since gain scores are calculated by subtracting initial test
scores from later test scores, any error in the initial test scores
will result in a spurious negative bias in the measured correlations
between initial scores and gain scores. If the initial score is
overstated, for example, the difference between the initial score and a
later score will be understated, and conversely.

Virtually without exception the analyses on which these evaluations
are based did not assign treatment and non-treatment children on
a random basis. Perhaps the only foolproof evaluation strategy involves
comparing two groups of children who are identically matched and randomly
assigned. But these programs were meant by their originators to be
applied to the most disadvantaged children. Both political pressures
and the decisions of conscientious educators have almost without
exception combined to insure that the children who are placed in
treatment groups are the most disadvantaged. This being so, the children
left to be placed in comparison groups are most likely of greater ability
and from better environments than the treatment children.^
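A short simulation may make the consequence of non-random assignment concrete. It is only a sketch under stated assumptions: the score scale, the size of the true program effect, and the function name `estimated_effect` are hypothetical and do not come from any of the evaluations reviewed here. The same beneficial program is evaluated twice, once with the neediest quarter of children placed in treatment and once with a randomly drawn quarter.

```python
import random
import statistics


def estimated_effect(assignment, true_effect=3.0, n_children=8000, seed=1):
    """Difference in mean end-of-year scores, treated minus comparison.

    assignment: "neediest" treats the lowest-scoring quarter at baseline;
                "random" treats a randomly drawn quarter.
    All numeric settings are illustrative assumptions.
    """
    rng = random.Random(seed)
    baseline = [rng.gauss(50.0, 10.0) for _ in range(n_children)]
    order = list(range(n_children))
    if assignment == "neediest":
        order.sort(key=lambda i: baseline[i])  # lowest baseline first
    elif assignment == "random":
        rng.shuffle(order)
    else:
        raise ValueError("assignment must be 'neediest' or 'random'")
    treated_ids = set(order[: n_children // 4])

    treated, comparison = [], []
    for i, start in enumerate(baseline):
        growth = rng.gauss(5.0, 2.0)  # assumed ordinary yearly growth
        if i in treated_ids:
            treated.append(start + growth + true_effect)
        else:
            comparison.append(start + growth)
    return statistics.fmean(treated) - statistics.fmean(comparison)


if __name__ == "__main__":
    print("neediest-first estimate:", round(estimated_effect("neediest"), 2))
    print("random-assignment estimate:", round(estimated_effect("random"), 2))
```

With these assumptions the neediest-first comparison reports a large negative difference although the program adds three points to every treated child, while random assignment recovers a value close to the true effect. Comparing gains rather than end-of-year levels would reduce this particular distortion but would bring in the gain-score problems described in the preceding footnote.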

The problem would appear to be somewhat less serious in the
Westinghouse evaluation of Head Start, in which all children who
were eligible for the program were identified and then those who had
been in the program were carefully "matched" with those who had not.
But the matching was done ex post and on the basis of race, sex, and
socioeconomic background (the last necessarily somewhat crudely) and
not on the basis of ability. Also, as Stearns points out, the
Westinghouse Study tested different cohorts of children in grades one,
two, and three, and an equally appropriate interpretation of what was
revealed was that "during the years 1965 and 1966, when Head Start was
just getting organized, the programs were not as effective in changing
children's performance as in the 1967 and 1968 programs" (Stearns, 1971a,
p. 92). Matching was done somewhat more carefully for Follow Through
where some differences were shown.



^Those of us who have interviewed compensatory education personnel
extensively have found that it is widely accepted among managers of
compensatory education programs (Title I in particular) that the most
disadvantaged pupils are picked for treatment. Evaluation designs where
experimental and control children are assigned and evaluated in a
completely random manner with large enough group sizes to insure
meaningful outcomes are almost non-existent. Only two come easily to
mind, a demonstration program in San Jose, California evaluated by Rapp
et al. (1971), and the DiLorenzo New York Study discussed below. This
comment applies to "on line" projects only, and not to the more research
oriented projects that will be discussed later.

In the Title I surveys the selection of the projects was quite
obviously not representative of the country as a whole. The bias in
project selection is heavily in favor of large and urban core-city
school districts because larger districts normally have somewhat more
sophisticated evaluation staffs. Many of these large districts are just
the ones where the problems are most intractable. None of these surveys
is reasonably representative of national experience with Title I.

On the other hand, we must remember Piccariello's point, cited
above, that projects in which there were higher than average
investments in resources are more likely to be included. Since these are
the projects in which the greatest gains are to be expected, there is
a built-in positive bias to these national surveys.

Even when treatment and control groups are selected reasonably
well, spill-over or "radiation" effects going from the project to
non-project children may contaminate the evaluation. It is seldom
possible for program children to be completely quarantined from
non-program children. The fact that something new and novel is being
done in the school building can be infectious or it may prompt the
regular classroom teachers to work harder to keep their children from
being shown up.

In addition to the evaluation difficulties previously mentioned,
the analysis of compensatory education programs leaves something to be
desired. As pointed out by Gordon (1971):

One often finds a low level of expertise and inadequately
developed methods. The best educational research scientists
often choose to work with basic problems in child development,
learning, linguistics, etc., rather than evaluative
research (p. 4).

After analyzing over three hundred evaluation reports carefully
for his study of successful projects, Wargo concludes:

One begins to wonder whether the instructional components
associated with compensatory education programs are inadequate
or whether the fault lies in the evaluation procedures used
to determine their effectiveness (Wargo, 1971, p. 27).

Despite the evaluation difficulties just discussed, we hesitate
to dismiss the findings of the studies altogether. At the very
least they have a certain face validity, demonstrating that most
broadly funded compensatory education programs are not accomplishing
large gains in the performance of target children. This is enough
of a conclusion to cause concern even if we were to conclude that
compensatory education is not completely without beneficial effect.

FINDINGS FROM SMALL-SCALE EVALUATIONS

There are three smaller studies of the overall effects of Title I
projects. One is a study by Kiesling (1971a) of a random sample of 42
California projects that used the Stanford Reading Test. He found an
average gain in grade equivalent scores to be below the national norm
but higher than the rate of progress ascribed to Title I target
populations without treatment. Kiesling's sample was picked randomly but
was subject to the restriction that the district used the Stanford
Test. It is uncertain what bias this restriction may have caused in
the findings.

Another careful study was done by TEMPO (General Electric Company)
of compensatory education programs for 132 schools in 11 school districts
(Mosback, 1968). TEMPO found that all the children who were in the
program for the 1966-67 school year averaged only one-half month's less
achievement gain than the national average for all children. This
represents a higher rate of progress than previous rates of gain for
children in the program.

Wargo et al. (1971) identified some clearly successful compensatory
education programs after an exhaustive survey. Out of more than 1,200
evaluation reports for screening, 422 candidate programs were identified
and 326 of these answered a written query for additional information.
An in-depth analysis was made of these 326 evaluations and in the end
only ten were chosen. The reasons for rejection of the other 316
programs are set forth in detail in Table 2.

Table 2 

FREQUENCY OF PROGRAM REJECTION BY REJECTION REASON, EXEMPLARY 
COMPENSATORY EDUCATION STUDY BY WARGO ET AL. 

                                                        Rejection 
Rejection Reason                                   Frequency   Percent 

General Information 
  1. Unavailable                                       16         5.1 
  2. Incomplete                                        36        11.4 
  3. Outside scope                                     15         4.8 
                                                       67        21.3 

Methodology 
  1. Unclear or incomplete                             15         4.8 
  2. Sample                                            38        12.0 
  3. Improper comparison or norms                      12         3.8 
  4. Inadequate measures of cognitive benefit          60        19.0 
  5. Inadequate treatment                               8         2.5 
                                                      133        42.1 

Evaluation 
  1. Unclear or incomplete                             22         7.0 
  2. Improper design                                   20         6.3 
  3. Pre-treatment reference inadequate                 3         1.0 
  4. Statistics^a                                       3         1.0 
  5. Statistical significance^b                        42        13.3 
  6. Educational significance^c                        26         8.2 
                                                      116        36.8 

Total Rejected                                        316 

Total Reviewed                                        326 

^a Improper selection, use, or interpretation of statistical tests. 

^b Gains and/or differences favoring the program are unreliable; 
that is, they could occur by chance more than five times in 100 
replications. 

^c Achievement test gains that are less than expected in average 
children during a comparable period of time or, if none available, 
gains not significantly greater than those of a comparable control 
group. 

Source: Wargo et al., 1971, pp. 16, 17, 26. 

Sixty-eight projects were rejected because they did not show results 
that were statistically or educationally significant. Only ten projects 
showed significant gains. This implies, at worst, that there were 6.8 
times more failures than successes in well-evaluated programs. Even if 
we assume that one-quarter of the failures might have been eliminated 
on other grounds, the failure/success rate would still be five to one. 
This is far from encouraging, although it should be noted that the 
restrictions imposed by Wargo and associates were rather stringent with 
respect to statistical and educational significance. The only projects 
chosen were those with 30 or more pupils and whose pupils gained at 
least as fast as the national norm. Hence, many studies showing gains 
in the neighborhood of those found by Kiesling and TEMPO would have 
been discarded. 
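
The failure-to-success ratios quoted above can be checked with a short sketch. The counts are those reported in Table 2 (42 rejections on statistical significance, 26 on educational significance, and ten programs accepted); the variable names and the use of Python are ours for illustration and do not come from Wargo et al.

```python
# Counts taken from Table 2 (Wargo et al., 1971); names are illustrative only.
rejected_statistical = 42   # gains not statistically significant
rejected_educational = 26   # gains not educationally significant
successes = 10              # programs accepted as exemplary

failures = rejected_statistical + rejected_educational  # well-evaluated failures
worst_case_ratio = failures / successes                 # failures per success

# Assume, as the text does, that one-quarter of the failures would have
# been eliminated on other grounds anyway.
adjusted_failures = failures * (1 - 0.25)
adjusted_ratio = adjusted_failures / successes

print(f"worst case: {worst_case_ratio:.1f} to 1")
print(f"adjusted:   {adjusted_ratio:.1f} to 1")
```

Under these counts the worst-case figure is 68 failures to ten successes, and the adjusted figure rounds to the "five to one" cited in the text.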

INTERVENTIONS DESIGNED BASICALLY FOR RESEARCH 

There are a number of broad educational intervention studies of 
high quality designed in large part for research purposes. Because 
of their good evaluation designs their findings are probably quite 
trustworthy. It would not be possible to describe all of these studies, 
or even only the best ones, in great detail. We can present only a few 
examples of projects we considered instructive.^1 They were not 
picked on the basis of the amount of educational gains exhibited, 
although we are not sure they are a random representation either.^2 

These examples provide an impressive amount of evidence that educa- 
tional interventions can yield substantial results. It must be re- 
membered, however, that few of the programs described below have been 
replicated. We have no assurance that any of these programs would be 
successful if implemented on a large scale. We emphasize that these 
examples show that interventions can work. They do not demonstrate 
that any particular intervention will work or is even likely to work. 

^1 Appendix B contains a table summarizing the analytical results 
from a number of educational intervention studies. 

^2 Our choice of projects to discuss and much of the substance of 
the discussion that follows is heavily dependent upon excellent studies 
of this literature by Stearns (1971a) and Bissell (1970). 

Stanford University: Computer Assisted Instruction 



One of the more interesting sets of experiments in recent years 
has been the Computer Assisted Instruction (CAI) based at Stanford 
University. Jamison et al. (1971) report on experiments for more 
than 200 elementary grade children situated in California and rural 
Mississippi, whose regular classroom instruction was supplemented 
daily with about ten minutes of drill in arithmetic and reading skills. 
The arithmetic curriculum is arranged sequentially in concept blocks 
composed of a pre-test, five drills, and a post-test. The reading 
curriculum is based on phonics divided into seven content areas. 

Pupils sit at a console and answer questions that are sequentially 
more difficult. If the pupil misses an item, the program takes him 
back for the appropriate review. 

The effect of the instruction was quite pronounced, with signi- 
ficant differences in both reading and arithmetic. Differences were 
significant in all six grades in the Mississippi study. In California 
the CAI children outperformed the controls in three of six grades; but 
the overall average gain of the CAI subjects was only slightly more 
than that of the controls. 

Gordon: Early Child Stimulation Through Parent Education 

Gordon (1971) evaluated a home training project for poverty 
mothers. The object of the program was to accelerate infant learning 
patterns and to teach disadvantaged mothers how to continue to be 
effective in teaching their children in the home. The treatment was 
composed entirely of home visits by paraprofessional "parent educators," 
averaging about 30 visits per year, in which the instruction was directed 
toward the mothers rather than toward the infants directly. 

The children were all born at one hospital in a six-month period 
and were randomly assigned to groups before the mothers were contacted. 
At the end of 12 months, experimental children exceeded the controls 
in performance of 23 out of 30 tasks in the learning series used. 
Eight of the differences were statistically significant. Three of 
these were tasks that had not yet been reached by the parent educators, 
which would seem to indicate that mothers were successfully generalizing 
their instruction into areas not specifically covered. On the Griffiths 
Mental Development Scale, four of six subjects were significantly in 
favor of the experimental children, and the effect on the other two 
was positive as well. 

At the end of the second study year the children whose mothers 
had both years of instruction were best, those with one year starting 
on the first birthday next, those from the third month to the first 
birthday next, and controls last. All differences were significant 
except that between the last two groups. 



Karnes: Ameliorative Preschool 

Karnes (1969b) evaluated a program for economically disadvantaged 
children in Champaign-Urbana, Illinois. The program concentrated on 
language development and had class sizes of 15 (five each from low, 
middle, and high IQ groups with the highest in the low 90s), which 
met three 20-minute periods daily for instruction in mathematics con- 
cepts, language development and reading readiness, and science-social 
studies. Most teaching took place in small cubicles containing 
materials appropriate to the three content areas. 

Teachers adjusted their teaching according to pupil performance 
on the Illinois Test of Psycholinguistic Abilities. Language develop- 
ment was continually emphasized, and the teaching strategy centered 
on verbalization in conjunction with concrete materials. Pupils in 
the program gained about twice as many points on the Stanford-Binet 
over the two-year period as a control group. 




DiLorenzo: Pre-Kindergarten Programs in New York State 

This is a study of pre-kindergarten programs in eight school 
districts in New York State (DiLorenzo, 1969). Project staff assigned 
experimental and control groups completely at random in all eight 
participating districts. Goals in all were concentrated on language 
development, self-concept, and physical growth. All children were 
disadvantaged according to the Warner Scale. 

Each district had one 150-minute class daily (with one exception, 
four times weekly). Teams of observers working in pairs made extensive 
observations concerning the teachers in each district. The districts 
had programs with differing amounts of structure, but a majority of 
the districts organized the children's activities externally to at 
least a moderate degree. 

The overall effect of all programs was to make a slight difference 
on the Stanford-Binet. Language development on both the Peabody Picture 
Vocabulary Test and the Illinois Test of Psycholinguistic Abilities was 
significantly better in favor of experimental groups. Despite these 
gains, however, the disadvantaged children gained little on the controls 
from the non-disadvantaged population. The gaps between the two groups 
on the Binet and the Peabody were slightly reduced. 

Project Conquest, East St. Louis, Illinois 

In the Project Conquest reading remediation programs, children 
who have the potential to read at their grade level but who are more 
than one year behind the norm are selected to receive four 15-minute 
periods of instruction in reading rooms or two 45-minute periods in 
reading clinics weekly. The instruction is individualized, with con- 
siderable problem diagnosis. Teachers are trained specialists and 
the teacher-pupil ratio is usually one to six. There is also a program 
of extensive in-service training for classroom teachers in techniques 
of remedial reading training. 

During the 1969-70 school year, 87 elementary school pupils in the 
reading rooms gained about 1.3 months per month of instruction on the 
Gates Primary Reading Test and 268 children in the clinics gained 
about 1.4 months per month of instruction (Wargo, 1971). 

LONGITUDINAL ANALYSIS 

The programs described just above are all interventions in which 
there was at least some treatment during the same year as the testing. 
Such outcomes, even though impressive, do not answer the most important 
questions concerning broad educational interventions, which are concerned 
with longer term effects. Does a year or two of educational intervention, 
even if highly successful, have effects that are still visible three or 
four or more years later? In general, if program children do not have 
their intervention education constantly reinforced, will it last? 

We might be justified in considering the results of the Westing- 
house study as being longitudinal, since it considered Head Start 
children one, two, and three years after they were in the program. 
As there were no significant differences between Head Start and non- 
Head Start children in first and second grades, it would appear that any 
gains had faded out. However, as Stearns (1971a) points out, since the 
same children were not retested, this result could be explained by 
earlier cohorts having poorer programs. 

Intelligence Test Findings 

One of the most interesting longitudinal studies is that by Gray 
and Klaus (1970) of a group of 88 Negro children born in 1958 living 
in the Upper South. The children were divided into two experimental 
groups and one control group, all of whom lived in the same ghetto- 
like community of 25,000; a second control group, of 27 children, was 
drawn from residents of a similar community 65 miles away. The first 
experimental group (T1) attended a ten-week pre-school program during 
each of three summers beginning in the summer of 1962 and received 
weekly visits from a specifically trained home visitor for three years. 
The second group (T2) had the same treatment except that it began a 
year later and lasted only two years. The local (T3) and distant (T4) 
control groups received all of the tests but no intervention treatment. 

After the program ended all four groups were retested each year 
through the seventh year of the program, which was 3-1/2 years after 
the last home visitor contacts. The pupil populations were extremely 
stable over the whole time period, which meant that attrition was a 
minor problem. 

The Gray and Klaus study is especially interesting. Treatment 
levels were not massive (cost levels per child were only slightly above 
$300, see below). As Gray and Klaus state: 

Perhaps the remarkable thing is, with the relatively [...] 
months for two or three years, [...] 
the massive effects of a low income home in which 
the child had lived since birth onward (Gray and Klaus, 
1970, p. 13). 



PROGRAM CHARACTERISTICS ASSOCIATED WITH SUCCESS 

Although this section is primarily concerned with broad educational 
interventions and not with the effects of specific treatment charac- 
teristics in detail, nevertheless interesting differences exist in 
types of interventions. This section explores these differences and 
discusses evidence of which kinds of interventions are most successful. 

Bissell has constructed a very useful typology of educational 
interventions based primarily on the amount of program structure, 
or the "amount of external organization and sequencing of children's 
experiences" (Bissell, 1970, p. 11). Her concept of structure also 
includes the degree to which objectives are organized hierarchically 
and the degree to which the role of the teachers is directive or 
non-directive. Programs that are not structured are designated as 
"permissive." Using the structure concept and also the degree to 
which interventions are devoted to purely cognitive goals, Bissell 
constructed a five-fold typology as follows (Bissell, 1970, pp. 11-13): 

Permissive Enrichment programs, which have "multiple [...] 
letting children's needs [...] 
[...] to children's experiences." Example: Most 
Head Start programs. 

Structured Enrichment programs, which also have [...] 
specific emphasis on language development. The [...] 
of these programs centers around the teacher's capitalizing 
on informal experiences for learning, thereby providing a 
moderate degree of structure to children's experiences." 
Example: "Traditional" preschool programs for disadvantaged 
children, such as that in Karnes' study. 

Structured Cognitive programs, which have "objectives 
oriented towards the development of learning [...] 
and relatively heavy specific teacher [...] 
The strategies of these programs revolve around the teacher's 
directing activities in which the children [...] 
sometimes in prescribed ways and sometimes flexibly. [...] 
programs in this category range from [...] 
moderate degree of structure to those providing [...] 
degree of structure to children's experiences." Example: 
Karnes' Ameliorative Preschool. 

Structured Informational programs, which [...] 
oriented towards teaching specific [...] 
particular, language patterns. The [...] 
programs involves the teacher's directing [...] 
children participating in them in prescribed ways. [...] 
resultant structure in the children's experiences [...] ex- 
tremely high." Example: Bereiter-Engelmann Programs. 

Structured Environment programs, which [...] 
oriented towards the development of learning processes. 
[...] programs have a heavy specific emphasis on 
language development, [...] 
room materials and the teacher's mediation of [...] 
material interaction. This strategy provides a moderate 
degree of structure for children's experiences." Example: 
The Montessori Method. 



Bissell also outlines a second attribute of interventive programs 
which she terms "quality," by which is meant the nature and amount of 
program supervision and personnel training. The degree of coordina- 
tion and cooperation of program staff would probably also be included 
in her idea of quality, since it is presumably highly related to the 
nature of program management and supervision. 

Several sets of research results can be analyzed using the criteria 
just developed, and most of the findings from these seem to assume a 
similar pattern, at least for short-run effects. The pattern is that 
program success is positively related to both program structure and 
program quality. 

A number of writers have found the structure result. Gordon, 
after surveying all the research on Title I projects nationwide, 
concludes, 

The tightly structured programmed approach including frequent 
and immediate feedback to the pupil, combined with a tutorial 
relationship, individual pacing and somewhat individualized 
programming are positively associated with accelerated pupil 
achievement. (Gordon, 1971, p. 24; emphasis in the original.) 

The painstaking work by Hawkridge, Wargo, and their associates 
at the American Institutes for Research, which has already been men- 
tioned, is difficult to summarize briefly because it is composed of 
descriptive material concerning the successful programs identified. 
The same is also true of Kiesling's study of successful California 
Title I and Senate Bill 28 (a California demonstration program) pro- 
jects. Their results strongly support Bissell's notion of the import- 
ance of good program supervision and personnel training ("quality"). 
Careful planning and good teacher training are mentioned both by 
Hawkridge at the preschool level and by Kiesling. Hawkridge mentions 
the careful specification of objectives as being important at all three 
educational levels. Perhaps this can be interpreted both as a quality 
and as a structure characteristic. 

Another study that carefully traced differences in the effect- 
iveness of program types that can be related to the structure cri- 
terion is that by Miller and Dyer for two kinds of kindergarten after 
four types of Head Start. The four types of Head Start (exclusive of 
controls) were: Bereiter-Engelmann (Structured Informational), DARCEE 
(Structured Cognitive), Montessori (Structured Environment), and 
Traditional (Structured Enrichment). The Follow Through kindergarten 
was a highly academic program structured as a token economy where 
the school day was divided into earn and spend periods. Children not 
in Follow Through kindergarten were placed in regular kindergartens 
of the Louisville, Kentucky, city schools. 

The evaluation design in the Miller study was carefully drawn 
and the findings are quite rich and complex. However, the best single 
summary of the findings is probably the results for the Metropolitan 
Readiness test at the end of kindergarten. The most striking result 
is the unambiguous superiority in the performance of Follow Through 
children. Otherwise, the findings with respect to type of Head Start 
are more ambiguous. The two groups that did best had Follow Through 
(relatively structured) along with relatively unstructured Head Start 
programs. The most highly structured program -- Bereiter-Engelmann -- 
yielded good performance for Follow Through children but was a disaster 
for non-Follow Through children, who scored seven points lower than the 
regular kindergarteners who had had no preschool at all. Montessori 
children did worst of the Follow Through groups and best of the tradi- 
tional groups. 

On the Stanford-Binet, the Bereiter-Engelmann children started 
highest at the beginning of kindergarten (thus they did best in pre- 
school) and the Follow Through group ended four points higher than 
any other group, while the non-Follow Through Bereiter-Engelmann 
children fell five points. The traditional Head Start plus Follow 
Through combination was next best, followed by the regular kindergarten 
DARCEE and Montessori children. On the Dog-and-Bone test, 
comparing all kindergarten children by Head Start program, the Montes- 
sori children did best, followed by DARCEE, controls, traditional, and 
Bereiter-Engelmann, in that order. 

Although the Miller-Dyer findings appear confusing, there are some 
generalizations that can be drawn. Short-term cognitive performance 
is better in the more structured programs; children in traditional 
and Montessori programs do better with such skills as curiosity and 
inventiveness. Also the more structured programs -- especially 
Bereiter-Engelmann -- seem to create more dependence on the part of 
the children toward the treatment, and therefore these children, 
when thrown into a regular "sink-or-swim" school situation where 
there is much less individual attention, seem to lose their former 
high gains rapidly. 

The last set of studies we will discuss with respect to features 
of successful programs are those, also more broadly discussed above, of 
Karnes, DiLorenzo, and Weikart, as re-analyzed by Bissell. The methodology 
used by Bissell, in which she co-varied for beginning score level (which 
was not done by the original authors), is what we consider to be the best 
approach for controlling for the "regression to the mean" phenomenon. 

The results of Bissell's re-analysis of these three programs are 
strikingly similar, with the more structured programs achieving the better 
results. In some further analysis according to degree of disadvantaged 
pupil environment, Bissell concludes that the more highly structured 
programs make the largest difference for the most disadvantaged children; 
less structured approaches are more effective with less disadvantaged 
children. Bissell concludes that the most disadvantaged children probably 
have difficulty in being self-directing and require constant supervision 
and guidance much more than the relatively more advanced children. 

COST OF COMPENSATORY EDUCATION 

Evaluations seldom provide the data required to compute the costs 
of educational programs that are built into a superstructure of an 
existing educational program. However, for the purposes of broad 
policy analysis, calculations to the last dollar are not particularly 
necessary anyway; broad ball-park estimates can be quite instructive. 

The original funding for Title I equalled a sum half the average 
state expenditure per pupil for each disadvantaged child, the impli- 
cation being that this much (which fell in the $250-$300 range in 
1966) was to be spent on each child. Subsequently Title I has been 
underfunded from the standpoint of this rather rich objective and the 
current average appropriation for each child officially designated 
as poor is less than $200. Head Start was not funded as broadly as 
Title I, but per-pupil expenditures in most Head Start centers range 
roughly around the $300 mark. 

^See note above, p. 105. 

The New York Higher Horizons program provided, in effect, three 
extra full time teachers, plus some equipment, materials, and so on, 
devoted to schools with enrollments in the neighborhood of 1000 to 1200. 
This amounts to an expenditure level of around $60 per pupil. 

At the other end of the spending spectrum are any number of the 
projects described in the literature that have been exceptionally 
successful. For example, the Karnes Ameliorative Preschool program 
which has been discussed in several places above takes, as nearly as 
we can tell, the equivalent of three special teachers one hour a day 
for 15 children. If we assign an additional hour a day for prepara- 
tion time, this amounts to one full-time teacher for 15 children, or 
about $800 per year per child for instructional costs alone. The 
Bereiter-Engelmann program used in the Karnes-Teska-Hodgins study would 
require a similar pattern of resource use. The Gordon experiments with 
home training of mothers by carefully trained paraprofessional peers are 
also surprisingly expensive. As in many of these program descriptions, 
we found it difficult from this one to ascertain the exact (or even near 
exact) pattern of resource use. Our best guess is that, with an average 
of 30 home visits a year (as stated in the report), with salary for 
parent educators placed at $5500 per year (including fringes), and 
figuring necessary transportation and supervision costs, the program 
would cost in the range of $500 to $600 per child per year. 
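
The Karnes estimate above can be reproduced roughly as follows. This is only a sketch: the six-hour teacher day and the $12,000 annual cost of a full-time teacher are our assumptions, chosen because they are consistent with the figures in the text (one full-time teacher for 15 children at about $800 per child). Neither number is stated in the source.

```python
# Rough reconstruction of the Karnes Ameliorative Preschool cost estimate.
# ASSUMPTIONS (not in the source): a six-hour teacher day and a $12,000
# annual cost per full-time teacher.
HOURS_PER_TEACHER_DAY = 6          # assumed
ANNUAL_COST_PER_TEACHER = 12_000   # assumed, in dollars

special_teachers = 3
instruction_hours_each = 1   # one hour a day per teacher (from the text)
preparation_hours_each = 1   # additional hour assigned for preparation
children = 15

# Total teacher hours per day, expressed as full-time teachers.
teacher_hours = special_teachers * (instruction_hours_each + preparation_hours_each)
full_time_equivalents = teacher_hours / HOURS_PER_TEACHER_DAY

# Instructional cost per child per year under the assumed salary.
cost_per_child = full_time_equivalents * ANNUAL_COST_PER_TEACHER / children

print(f"{full_time_equivalents:.1f} full-time teacher(s), ${cost_per_child:.0f} per child")
```

Changing either assumed constant shifts the per-child figure proportionally, which is why such estimates are best read as ball-park values only.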

The 1971 publication by Wargo et al. listed ten exemplary projects, 
which is all they could find out of 326 they reviewed carefully. Of 
these, three were in an expenditure range far above current levels, 
even if we allow for reasonable one-time research and development ex- 
penses. These were the Fernald School at UCLA, which gave highly indi- 
vidualized instruction to disadvantaged children at a total cost of $1200 
per child (or $400 to $500 more than most public schools cost); the 
Lafayette Bilingual Center in Chicago, which offered English and Spanish 
instruction to disadvantaged Spanish-speaking children, costing $1500 
per pupil; and Project Breakthrough, also in Chicago, a preschool pro- 
ject that used "Talking Typewriters," at a cost of $3600 per pupil. The 
other seven "successful" projects cost between $100 and $367 per pupil. 

Perhaps one of the more useful sets of cost information has been con- 
structed by Kiesling (1971a) on the basis of his observations of successful 
Title I and Senate Bill 28 projects. Kiesling constructed 12 program 
prototypes closely based upon the configurations he saw in the actual 
programs and constructed cost estimates with the use of a standardized 
list of resource costs. The per-pupil cost varies from one program 
prototype to another and, within each program prototype, per-pupil 
costs vary with the scale and intensity of the program. The minimum 
estimate is $153 per pupil in a program that relies heavily on volun- 
teer aides. The maximum estimate is $445 per pupil for a program in 
which pupils leave the regular classroom to see a specialist in a 
resource facility. 

The Gray and Klaus preschool program had 10-week summer programs 
and weekly home visits by a trained teacher. Assuming a summer employ- 
ment of the teacher at one-fifth her yearly salary, $4000 for four 
aides, and other miscellaneous expenses, we estimate a figure of 
about $140 per child per ten-week summer program. The teachers' 
weekly visits probably cost at least $200 per child per year, which 
places the yearly cost in the $350-$400 per pupil range. It should 
be remembered that there were some diffusion effects and so the 
benefits cannot strictly be limited to the program children. 

Barbrack and Horton (1970) reported a preschool experiment 
somewhat similar to that of Gray and Klaus. There were three experimental 
treatments: home visits by a professional teacher, home visits by 
paraprofessional peer mothers well supervised by a professional teacher, 
and home visits by paraprofessional peer mothers supervised by more ex- 
perienced mothers who were in turn supervised by a professional teacher. 
Per-pupil costs were calculated by Barbrack and Horton for these three 
treatments to be $440, $300 and $275, respectively. 

The schools in the New York project discussed by DiLorenzo (1969) 
had daily instructional sessions of 2-1/2 hours with an average class 
size of about 15. Assuming one teacher saw two cohorts per day, and 
adding costs of room, materials, and supervision, places the cost of 
these programs in the $400-$500 per pupil range. 

The Higher Horizons 100 Project in Hartford, Connecticut gives 
remedial language and intensive counseling to 100 disadvantaged ninth 
grade students annually. The program has small classes, considerable 
counseling and individualized instruction, with emphasis on remedial 
language instruction. Program total per-pupil costs are $900, which 
is perhaps $100 to $300 more than the per-pupil costs in most northern 
school districts (Wargo, 1971). 

Wargo (1971) discussed one preschool home instruction program 
similar to the Gray and Klaus, and Barbrack and Horton studies dis- 
cussed just above. A trained "Toy Demonstrator" visited each mother- 
child combination twice weekly. The cost was $387 per pupil, not 
unlike the professionally staffed programs of Gray and Klaus, and 
Barbrack and Horton. 

Another remedial reading program discussed above was Project 
Conquest in East St. Louis. Project children received remedial 
reading instruction in 45-minute sessions four days a week in read- 
ing rooms or twice weekly in reading clinics. Per pupil cost was 
$263. Another was Project MASS in Leominster, Massachusetts, where 
pupils spent 45 minutes daily in special reading classrooms at a 
cost of $300 per pupil. Another was the Remedial Reading Laboratories 
in El Paso, Texas, where pupils were taught in small groups of about 
eight pupils for 50 to 60 minutes each day. The cost in this program 
was $210 per pupil. 

Finally, the PS 115 Alpha One reading program in New York City 
used the commercially available Alpha One language arts program which 
makes reading and writing into a game in which children participate in 
creative and dramatic play, etc. The average cost over three years 
appears to have been about $200 per pupil. 

SUMMARY OF FINDINGS 

The discussion of intervention programs in this section leads 
us to the following conclusions: 

o Virtually without exception, all of the large surveys of the 
large national compensatory education programs have shown no beneficial 
results on average. However, the evaluation reports on which the 
surveys are based are often poor and research designs suspect. 

o Two or three smaller surveys tend to show modest and positive 
effects of compensatory education programs in the short run. 

o A number of intervention programs have been designed quite 
carefully and display gains in pupil cognitive performance, again in 
the short run. In particular, pupils from disadvantaged socioeconomic 
backgrounds tend to show greater progress in more highly structured 
programs. (Programs that are highly structured are those in which 
the sequencing of the children's experiences is heavily organized 
externally.) 

o There is considerable evidence that many of the short run 
gains from educational intervention programs fade away after two or 
three years if they are not reinforced. Also, this "fade-out" is 
much greater for the more highly structured programs, which are most 
unlike regular public school practice. 

o It would appear that per-pupil costs of successful educational 
intervention vary anywhere from $200 on up, with the "feasible range" 
for such programs falling between $250 and $350. However, numerous 
interventions funded at these levels have failed. Clearly the level 
of funding is not itself a sufficient condition for success. 

INTRODUCTION: THE APPROACH DEFINED 

The experiential approach arises from recent renewed interest in 
school reform. For the past decade a number of [...] 
Friedenberg (1963), Henry (1963), Holt (1964), Goodman [...], 
Herndon (1968 and 1971), Kohl (1968), Kozol [...], 
Silberman (1970), Postman and Weingartner (197 ), [...] 
assailed America's existing educational system on a variety of grounds. 
Despite the diversity among these and other [...], 
[...] elements emerge, which we call the experiential approach. 
[...] in effect that the [...] students' [...] 
classes and for the rest of their lives. Therefore, to 
these authors, the other approaches discussed in [...] -- input- 
output, process, organizational, evaluation -- are all [...] 
irrelevant, unless they affect: (1) the student's concept about himself 
as an individual and as a member of the society (classroom, school, com- 
munity, and so on) that impinges on him, (2) the style that the student 
develops to deal with school experiences (notably [...] 
student-student transactions), (3) the attitudes toward insti- 
tutions that students develop as a consequence of [...] ex- 
perience with one such institution [...] system. 

This doesn't mean that the reform writers believe that cognitive 
skills are unimportant. What they generally do believe is that the 
[...] con- 
clusions concerning the value of cognitive skills, especially Gintis 
(1971). 

These kinds of variables are also studied in the other approaches 
we have reviewed. In those cases, however, experimental (or statistical) 
controls were introduced to eliminate confounding influences and high- 
order interactions between variables that confuse data interpretation. 
The success of this venture was discussed in previous sections, but we 
note in passing that researchers have not been overly successful in 
applying valid experimental designs to social situations. The reform 
authors observe these variables in a completely unconstrained environ- 
ment, which allows more meaningful behavior to occur but makes data in- 
terpretation more difficult. The variables interact in complex and 
unknown ways, and even worse, variables change as a result of inter- 
action with other variables. For example, teachers generally behave 
in a way that they perceive the system expects them to. This in turn 
fortifies the system's bias in that direction. 



The difference in viewpoint between conventional educational
researchers and the reform writers is shown when one considers the kinds
of outcome measures endorsed by each. In Section II we point out that
educational research almost exclusively uses standardized tests to
measure educational achievement. We further point out that these tests
measure only a fraction of possible educational outcomes. Of course,
the extensive use of standardized tests for measuring the retention of
bits and pieces of material, largely learned by rote, in part reflects
the objectives of the school. The reform writers attribute little
importance to performance on standardized tests and associated curriculum
material. As noted above, they generally feel that it is important for
children to acquire reading and math skills and so on, but they view this
as almost an incidental accomplishment within the broader objective of
student achievement.



Students can acquire skills in the pursuit of more meaningful goals.
Illich points out one way that learning of skills might occur (1971,
p. 13):

The strongly motivated student who is faced with the task of
acquiring a new and complex skill may benefit greatly from the
discipline now associated with the old-fashioned schoolmaster
who taught reading, Hebrew, catechism, or multiplication by
rote. School has now made this kind of drill teaching
rare and disreputable, yet there are many skills which a
motivated student with normal aptitude can master in a
matter of a few months if taught in this traditional way.
This is as true of codes as of their encipherment; of
second and third languages as of reading and writing,
and equally of special languages such as algebra, computer
programming, chemical analysis, or of manual skills
like typing, watchmaking, plumbing, wiring, TV repair; or
for that matter dancing, driving, and diving.

Instead, according to these writers, children are required to learn
irrelevant "facts" and "skills," which they know are unimportant and
which bore them and turn them off to meaningful learning. Postman and
Weingartner (1970) call this process the school game of "Let's Pretend"
(p. 49):

The game is called "Let's Pretend," and if its name were
chiseled into the front of every school building in America
we would at least have an honest announcement of what takes
place there. The game is based on a series of pretenses which
include: Let's pretend that you are not what you are and
that this sort of work makes a difference to your lives; let's
pretend that what bores you is important, and that the more
you are bored, the more important it is; let's pretend that
there are certain things everyone must know, and that both
the questions and answers about them have been fixed for all
time; let's pretend that your intellectual competence can be
judged on the basis of how well you can play Let's Pretend.

Standardized tests are viewed as relatively unimportant for a
number of reasons. To begin with, achievement on standardized tests is to
a high degree a matter of sophistication about test taking and one's
attitude about it. Research reviewed earlier on the effect of teachers'
expectations indicates that test achievement is not always an indication
of the amount learned (for example, Rist, 1971). Kohl (1968) also cites
evidence that achievement test performance is only incidentally related
to learning. At the very least, for achievement testing to be a valid
and important measure of learning, the test must accurately assess how
much a student learns about some specific subject. As Holt (1968, p.
135) notes:

It begins to look as if the test-examination-marks business
is a gigantic racket, the purpose of which is to enable students,
teachers, and schools to take part in a joint pretense that the
students know everything they are supposed to know, when
in fact they know only a small part of it if any at all.
Why do we always announce exams in advance, if not to give
students a chance to cram for them? Why do teachers, even
in graduate schools, always say quite specifically what the
exam will be about, even telling the type of questions that
will be given. Because otherwise too many students would
flunk. What would happen at Harvard or Yale if a prof gave
a surprise test in March on work covered in October?
Everyone knows what would happen; that's why they don't do it.

Even if one could realistically measure how much a student knows 
about a specific subject, there are still grounds on which to question 
the relevance of what is learned. To quote Holt again (p. 177): 

We must ask how much of the sum of human knowledge anyone
can know at the end of his schooling. Perhaps a millionth.
Are we then to believe that one of the millionths is so
much more important than another? Or that our social and
national problems will be solved if we can just figure out
a way to turn children out of schools knowing two millionths
of the total, instead of one?

Holt and others believe that it is more important for students to
learn how to learn, to solve problems, and to be curious than to acquire
specific and mostly irrelevant bits of information. The reform writers
focus on those outcomes that involve higher cognitive processes (abstract
reasoning, creativity, problem-solving, and so on) and affective factors
(self-concept, happiness, interest, attitudes, and the like). Within
the framework of these goals they feel that the basic skills can be
developed.

LIMITATIONS OF RESEARCH

The material reviewed in this section is descriptive research rather
than analytical research and our comments on "limitations" are based on
broader considerations than in previous sections. The social scientists,
and especially social reform writers, attempt to cope with variables
that defy precise and operational definition and generally are impossible
to measure with an acceptable degree of reliability. In general the
social scientist is always faced with a choice of alternatives in
examining policy issues. On one hand he may be experimentally and
methodologically rigorous, but he is then limited to studying only simple or
highly constricted problems. These, even if solved, may have little
relevance to "real life" problems. [...] adequately applies [...]
on the basis of the evidence presented. Experimental research is
therefore often criticized on the grounds that it defines a problem too
narrowly; conversely, experiential literature is often criticized on the
grounds that it is not methodologically sound.

The most serious limitation of the "research" of the reform writers
lies in the necessarily inconclusive character of the obtained results
and the elusive nature of "proof." The influence of their findings
depends upon their ability to convince. Even if the [...] convincing
to the reader, it is difficult to [...] suggested, because the reform
writers rarely present a detailed prescription for moving to the better
learning environment they envision.



RESULTS

The results presented in this section are in terms of the opinions
of the authors, with a focus on issues where they show a general
consensus. Some writers are relatively conservative and limit their
criticism [...] plan for reform to something that they believe to be
feasible [...] political structure (for example, Kohl, Herndon) [...]
broad and extensive social as well as educational reform (for example
[...]). [...] differences among authors are mostly of degree [...]
consistency in their attitudes about educational reform.

The results will be organized under three headings: social values
and educational objectives, the school environment, and reformation.
[...] evidence overlap [...]. [...] the diagnostic impressions [...]
writers are summarized, and in the third we consider their prescriptive
recommendations. It is impossible in this brief summary to give
more than a rough indication of the findings reported by these
authors.

Social Values and Educational Objectives



Before we discuss the authors' views on educational objectives, we
note that their comments are based on objectives inferred from classroom
activities and, therefore, there is little relationship to the idealistic
jargon and lofty platitudes often found in curriculum philosophy regarding
school objectives. For example, creativity is generally found to be one
of the major objectives stated in curricula; but these writers would
contend that there is little in the actual instruction, classroom
activity, or testing that is even remotely connected with "creativity."



Each author, in his own way, questions the content and priorities
of educational objectives, and they criticize the social values
underlying them. Friedenberg, Goodman, and Illich present the most direct
assault on the influence of middle-class values - primarily conformity -
on education, although more conservative writers, Silberman, for example,
also make it clear that educational problems are not restricted to the
schools but lie rather in the social and political values that determine
educational practice. Although some authors do not make a major thesis
of an attack on social values, it is nevertheless implied throughout
their books. Holt, for example, frequently points out that the
submission and subjugation of children begin in the home and continue in the
classroom. He states (1968, p. 167):



We adults destroy most of the intellectual and creative
capacity of children by the things we do to them or make
them do. We destroy this capability above all by making
them afraid, afraid of not doing what other people want, of
not pleasing, of making mistakes, of failing, of being wrong.
Thus we make them afraid to gamble, afraid to experiment,
afraid to try the difficult and the unknown. Even when we do
not create children's fears, when they come to us with fears
ready-made and built-in, we use these fears as handles to
manipulate them and get them to do what we want.

Silberman (1970) describes the schools' [...], tracing the
cause repeatedly to social values and institutions. He states (p. 11):

This mindlessness - the failure or refusal to think seriously
about educational purpose, the reluctance to question
established practice - is not the monopoly of the public
school; it is diffused remarkably evenly throughout the entire
educational system, and indeed the entire society. "[...]
problem of policy-making in our society," Henry A. Kissinger
has said, "confronts the difficulty that revolutionary changes
have to be encompassed and dealt with by an increasingly rigid
administrative structure.... An increasing amount of energy
has to be devoted to keeping the existing machine going, and
in the nature of things there isn't enough time to inquire
into the purpose of these activities. The temptation is
great to define success by whether one fulfills certain
programs, however accidentally these programs may have been
arrived at. The question is whether it is possible in the modern
bureaucratic state to develop a sense of long-range purpose and
to inquire into the meaning of the activity." Kissinger was
talking about the problems of government; he might just as well
have been talking about higher education and the mass media.



and later (p. 36):

Why the failure of the mass media? The answer is at once
simple and complex. What is mostly wrong with television,
newspapers, magazines, and films is what is mostly wrong
with the schools and colleges: mindlessness. At the heart
of the problem, that is to say, is the failure to think
seriously about purpose or consequence - the failure of
people at every level to ask why they are doing what they
are doing or to inquire into the consequences.

The basic social value according to these writers is conformity,
and the society is geared to produce it, as Friedenberg (1963, p. 11) notes:

The essence of our era is a kind of infidelity, a disciplined
expediency.

This expediency is not a breach of our tradition, but its
very core. And it keeps the young from getting much out of
the diversity that our heterogeneous culture might otherwise
provide them. This kind of expediency is built into the value
structure of every technically developed open society; and
it becomes most prevalent when the rewards of achievement in
that society appear most tempting and the possibilities of
decent and expressive survival at a low or intermediate position
in it least reliable. Being different, notoriously, does not
get you to the top. If individuals must believe that they are
on their way there in order to preserve their self-esteem
they will be under constant pressure, initially from anxious
adults and later from their own aspirations, to repudiate the
divergent elements of their character in order to make it under
the terms common to mass culture. They choose the path most
traveled by, and that makes all the difference.

He notes the high success attained by society in promoting the
acceptance of conformity in students:
[...] they do firmly and sincerely believe that [...] with their
immediate social order, and that people who don't are troublemakers who
[...] deservedly bad end. They are genuinely suspicious of, [...]
hostile to, people who insist on their own privacy and dignity
against [...]. They are convinced that strong [...], and that it is
not merely [...] wrong to allow such [...] need.

Schools are social institutions and as such they perpetuate the
values of the society. Although this may be understandable, and to some
degree necessary, these writers (and many others) point out an alarming
similarity among schools; almost none provide an environment for any
kind of individual and creative growth. The gloomiest note is the
refrain sounded by Henry (1963, p. 286):

[...] the spirit [...] children should never escape Homo
sapiens has employed praise, ridicule, admonition, accusation,
mutilation, and even torture to [...] culture pattern.
Throughout most of his historic course Homo sapiens has wanted
from his children acquiescence, not [...] natural that this
be so, for where every man is unique there is no society, and
where there is no [...]. Contemporary American educators think
[...] creative children, yet it is an open [...] expect these
children to create. [...] classrooms, from kindergarten [...]
they expect it to happen, are [...] reason that were young
people truly creative [...] from what is given, and what is
given is the culture itself. From [...] "creative hours" of
kindergarten to the [...] problems in sociology [...] function
of education is to prevent [...] from getting out of hand.
[...] for we have (but [...] vessel of the social system.

The Learning Environment



Social values and educational objectives are expressed in the
structure of schools and classrooms, in what is often termed the learning
environment. It is not surprising, therefore, that most of the writing about
the ills of education and need for reform centers around the learning
environment. There is a long history of criticism (pointed out by
Silberman, 1970) of the overly structured, authoritarian, and strictly
disciplined classroom, and all of the writers we are reviewing here
advocate less formal classrooms. The authors agree in describing schools
as boring and prison-like in character, a feature that exists not only in
poor ghetto schools but also, usually in more subtle form, in middle-class
schools. Thus:

Postman and Weingartner (1969, p. 155): 

City schools as they now exist largely confine students
to sitting in boxes with the choice of acquiescing to
teacher demands or getting out.

Herndon (1971, p. 97): 

If kids in America do not go to school, they can be put
in jail. If they are tardy a certain number of times, they
may go to jail. If they cut up enough, they go to jail. 

If their parents do not see that they go to school the 
parents may be judged unfit and the kids go to jail. 

(p. 9[...]):

as long as you can threaten people, you can't tell whether
or not they really want to do what you are proposing that
they do. You can't tell if they are inspired by it, you
can't tell if they learn anything from it, you can't tell
if they would keep on doing it if you weren't threatening
them.

You cannot tell. You cannot tell if the kids want to come
to your class or not. You can't tell if they are motivated
or not. You can't tell if they learn anything or not. All
you can tell is, they'd rather come to your class than go
to jail.

Holt (1970, p. 68):

Boredom. Almost all children are bored in school. Why shouldn't
they be? We would be. The children in the high status and
"creative" private elementary schools I taught in were
bored stiff most of the day - and with good reason.
Very little in school is exciting or meaningful even to
an upper middle-class child; why should it be so for slum
children? Why, that is, unless we begin where schools
hardly ever do begin, by recognizing that the daily lives
of these children are the most real and meaningful and
indeed the only real and meaningful things they know.

These writers maintain that schools are too highly structured and 
too much committed to controlling and disciplining students, not only 
in the classroom, but in the hallways, and on the playground, and around 
the school. The school tells them where to go, what to do, and how to 
dress and provides an endless list of rules involving trivial and petty 
restrictions. The final travesty according to these writers is that in 
this environment teachers tell them about democracy, and individual
freedom, and responsibility, and all the other lofty ideals that every 
day the school flagrantly violates. These restrictions are imposed 
immediately by teachers, and more remotely by school administrators, 
but ultimately by parents and society. Teachers and school administra- 
tors are themselves severely limited in the freedom they can exercise 
in teaching strategies and administrative arrangements, although more 
freedom is available than they use. Kohl describes the feelings of a 
teacher in the bureaucratic structure (1969, p. 11): 

When I began teaching I felt isolated in a hostile
environment. The structure of authority in my school was
clear: the principal was at the top and the students were
at the bottom. Somewhere in the middle was the teacher,
whose role it was to impose orders from textbooks or
supervisors upon the students. The teacher's only
protection was that if students failed to obey instructions
they could legitimately be punished or, if they were defiant,
suspended or kicked out of school. There was no way for
students to question the teachers' decisions or for teachers
to question the decisions of their supervisors or authors
of textbooks and teachers' manuals.

Teachers are too busy controlling children, following inappropriate
curricula, trying to please parents and getting along with school
administrators to have much time available for teaching. Although the
teacher is generally the focal point for criticism of schools, the
teachers are also victims of a system over which they have little
control. Kohl describes the position of teachers (1969, p. 89):

Supervisors deal with teachers in the same way they
expect teachers to deal with students. They are usually
more interested in avoiding problems and maintaining
control than in matters having to do with teaching. As far
as they are concerned the content of the curriculum has
been mandated by a Board of Education or a curriculum
committee, and it is the teacher's role to follow the
curriculum. A good teacher, like a good soldier, is one
who obeys orders. An excellent teacher is one who obeys
them cheerfully and willingly.

Silberman comments at some length on the dilemma facing the teacher,
and at one point states (p. 320):

Indeed, given the obsession with silence and lack
of movement that so many principals, superintendents, and
curriculum supervisors show, and the fact that teachers'
ability tends to be judged more on their "control" than
on any other attribute, it is essential that someone be
available to relieve teachers' anxiety about what their
supervisors may say if they see children talking or
moving about in class. Teaching, after all, is a very
lonely profession.



The reform writers point out from time to time that teachers are
not basically bad or cruel or disinterested in their students. Mostly
they do what they are forced to do by the structure of the school, and
many times their behavior is simply the result of not knowing, or more
often because they are products of the same kind of system in which they
are teaching. Whatever their motives, however, teachers serve as a
model and as wardens in the education of children, and for the most
part the results are not favorable, as Goodman notes (1970, p. 78):



As Gregory Bateson has noticed with dolphins and
trainers and as John Holt has noticed in middle class
schools, learning to learn usually means picking up the
structure of behavior of the teachers and becoming expert
in the academic process. In actual practice, young
discoverers are bound to discover what will get them past the
College Board examinations. Guessers and dreamers are not
really free to balk and drop out for a semester to brood
and let their theories germinate in the dark, as proper
geniuses do. And what if precisely the Big Ideas [...]
true? Einstein said that [...] stupid pedant for a
teacher, so that a smart child could fight him all the
way and develop his own [...]



Not all teachers, of course, go along with the administrative
doctrine as viewed by the reform writers; some attempt
in various degrees to deviate from accepted procedure. Successful
education programs and teaching approaches are sometimes reported,
especially in schools where students are largely "culturally deprived."
These are always the result of deviations from standard procedure and
involve independent action on the part of the teacher. These
deviations, however, are not generally encouraged by the school and often
are not even accepted. Illich comments on the fate of inventive teachers
(1970, p. 65):

The "classroom practitioner" who considers himself
a liberal teacher is increasingly attacked from all sides.
The free-school movement, confusing discipline with
indoctrination, has painted him into the role of a destructive
authoritarian. The educational technologist consistently
demonstrates the teacher's inferiority at measuring and
modifying behavior. And the school administration for which
he works forces him to bow to both Summerhill and Skinner,
making it obvious that compulsory learning cannot be a liberal
enterprise. No wonder that the desertion rate of teachers
is overtaking that of their students.



Herndon (1968, 1971) is a classroom innovator who has managed to
operate an "open classroom" in an otherwise conventional school. Kohl
(1971) reviews Herndon's latest book (How to Survive in Your Native Land)
and makes the following comment concerning his probable fate as an
innovator (p. 11):



There is one problem however. Jim managed to survive
in Daly City for nine years. He took six months off last
[...]. When he returned in February he was told that there
was no job for him at his old school until September. He
was made a roving sub in the district, one of the
bureaucratic strategies used to drive good people out of teaching.

Jim is going back this September but it is clear that
he is no longer to be tolerated. The new administration
of his school wants him out, the limits of toleration
having evidently been reached in Daly City. I guess people
[...] to understand what Jim is doing and [...]
that it is better to force him out than re-examine their
lives. I don't know how much longer Jim will survive in Daly
City. I think the final irony of the book is that maybe not
even Jim can survive in our native land.



If teachers are [...] choice, so are administrators, for no
matter what their plans for innovation they must answer to
political pressures and the demands of the community. Everyone
considers himself an expert on education because he has been there,
and notions about necessary and desirable educational practices are
projected on the basis of personal experience. So in the end we come
full circle and find that it is society that determines the school
practice. However, if the schools are persistent, change is possible,
and we close this section with a note of optimism in this regard.
Dennison (1969, p. 7) found:



It is worth mentioning here that, with two exceptions,
the parents of the children at First Street were
not libertarians. They thought that they believed in
compulsion, and rewards and punishments, and [...]
discipline and report cards, and homework, and elaborate
school facilities. They looked rather askance at our
noisy classrooms and informal relations. If they
persisted in sending us their children, it was not because
they agreed with our methods, but because they were [...].
As the months went by, however, and the children who had
been truants now attended eagerly, and [...]
failing now began to learn, the parents drew their [...]
conclusions. By the end of the first year there was a
high morale among them, and great devotion to the school.



Reformation: Prescription for Education

There is a striking similarity in the prescription these writers
offer for education; the differences are mostly a matter of degree and
political feasibility - a matter we will not attempt to resolve here.

The writers agree that at least part of the solution is to have less
formally structured classrooms in which the student can develop more
[...] or demands for conformity.

The completely "open classroom" is one in which the student is
allowed to wander around pretty much at will and to discover for
himself the things he wants to learn. The British elementary school open
classroom is often incorrectly used as the model for this approach.
Silberman (1970) advocates "informal [...]," the distinction being
that some minimal kinds of structure remain. His conclusions are based
[...] teachers in the British (and some American) [...]
skills and appear happy with the learning environment, result from the
ability of the teacher to introduce structure in an unobtrusive way.
Herndon (1971) describes an experience in an American school in which 
the open approach was tried, and failed at first. The children 
did not discover things they wanted to work on, nor did they develop 
group projects. Mostly they wandered around the halls and complained 
that there was "nothing to do." Finally, the introduction of a project, 
with some rather indirect structure, produced some of the student ac- 
tivity and learning that was expected in the open classroom. Herndon 
points out that the critical factor in making an open classroom work 
is the ability of the teacher to learn and adopt new approaches. 

Kohl (1969) has attempted to provide a guide to teachers for at- 
tempting open classroom techniques, a venture that is not easy, as he 
points out (p. 80):

The movement to an open classroom is a difficult
journey for most of us. The easiest way to undergo it
is to share it with one's pupils - to tell them where
you hope to be and give them a sense of the difficulty
of changing one's styles and habits. Facing uncertainty
in oneself, and articulating it to one's pupils, is one
way of preventing a superficial bias "against authority"
which, if it fails, can lead one to believe that the
open classroom just doesn't work. Freedom can be
threatening to students at first. Most of them are so
used to doing what they are told in school that it
takes quite a while for them to discover their own
interests. Besides that, their whole school careers
have taught them not to trust teachers, so they will
naturally believe that the teacher who offers
freedom isn't serious. They will have to test the limits
of the teacher's offer, see how free they are to refuse
to work, move out of the classroom, try the teacher's
nerves and patience. All of this testing must be gone
through if authoritarian attitudes are to be unlearned.

A recent article on the open classroom by Barth (1971) summarizes
some of the issues associated with conversion to the open classroom:

In the final analysis, the success of a widespread
movement toward open education in the [...]
upon agreement with any philosophical position but with
satisfactory answers to several important [...]
For what kinds of people - teachers, administrators,
and parents, children - is the open classroom
appropriate and valuable? What happens to children in open
classrooms? How can the resistance from children,
teachers, administrators, and parents - inevitable
among those not committed to open [...]
and practices - be surmounted? And finally,
should participation in an open [...]
of teachers, children, parents and administrators?



The critical issue related to the open classroom concerns the de- 
gree of structure that is necessary for each student and how structure 
is introduced. The same issue exists in the related subject of learning
by discovery - a topic pursued at some length in the research liter- 
ature and reviewed earlier in this report. This research has produced 
a number of arguments against the traditional classroom, although the 
emerging consensus (see Shulman, 1967) is that learning by discovery 
does not mean laissez-faire learning, and that much needs to be known 
about how to introduce structure in the discovery or free learning 
situation. The research of Gagné and Ausubel is particularly relevant
to this point (see Section IV). They present evidence that indicates
some kinds of material are best learned if the subordinate knowledge
is arranged in a sequenced hierarchy.

One purpose of the open classroom is to allow students (and sometimes
parents) a choice of activity and learning material, although
structure is provided so that the goals are not completely left up to 
the student. Some writers have gone so far as to suggest that one 
should also be able to decide whether or not one goes to school, and 
if so, where and when and what one studies. This stand has been made
specific by several, including Postman, Friedenberg, and especially
Illich. Friedenberg (1963, p. 249) comments:

Basically, then, I disapprove of compulsory
attendance in itself. I see no valid moral reasons to
single out the young for this special legal encumbrance.
The economic reasons are compelling enough; but they are
likewise contemptible. A people have no right to cling to
arrangements that can be made halfway workable
infantile and un
adolescents          indoctrin
goods and shallow, meretricious relationships that they

and Goodman (1970, p. 67):

The present expanded school systems are coercive in their
nature. The young have to attend for various well known
reasons none of which is necessary for their well-being
or the well-being of society.

Illich (1971, p. 9) states:

Obligatory schooling inevitably polarizes a society;
it also grades the nations of the world according to an
international caste system. Countries are rated like
castes whose educational dignity is determined by the
average years of schooling of its citizens, a rating which
is closely related to per capita gross national product,
and much more painful.

and later (p. 12):

A second major illusion on which the school system
rests is that most learning is the result of teaching.
Teaching, it is true, may contribute to certain kinds
of learning under certain circumstances. But most
people acquire most of their knowledge outside school,
and in school only insofar as school, in a few rich
countries, has become their place of confinement
during an increasing part of their lives.

Although the idea of non-compulsory education is extreme, it is
not haphazard and is presented with jolting logic, especially by
Illich, who argues that compulsory education is not personally re-
warding, socially desirable, or economically feasible. Illich and
others have pointed out the escalating costs of education, most of it
tied to the futile quest for equal schooling. Equal opportunity for
education as a political issue has been distorted to mean everyone is
equally educable, at least to the extent that school children all per-
form "at grade level" on standardized tests of arithmetic and reading.
The billions of dollars poured into compensatory programs have not pro-
duced any of the sought-for improvements in skills for the so-
called disadvantaged child, especially when money is spent in con-
ventional classroom
every respect
educational costs are increasing and equal schooling is eco-
nomically infeasible. Illich

In the United States it would take eighty billion
dollars per year to provide what educators regard as
equal treatment for all in grammar and high school.
This is well over twice the $36 billion now being spent.
Independent cost projections prepared at HEW and the
University of Florida indicate that by 1974 the com-
parable figures will be $107 billion as against the
$45 billion now projected, and these figures wholly
omit the enormous costs of what is called "higher
education," for which demand is growing even faster.
The United States, which spent nearly eighty billion
dollars in 1969 for "defense" including its deploy-
ment in Vietnam, is obviously too poor to provide
equal schooling. The President's committee for the
study of school finance should ask not how to support
or how to trim such increasing costs, but how they
can be avoided.

Although the economic arguments are compelling, these authors are
not primarily concerned about the dollar cost of compulsory schooling.
The reform writers are basically concerned with individual happiness
and the construction of a society in which each individual can find
useful and gratifying activity. In their view, compulsory schooling
has produced an emphasis on amount of schooling as a measure of com-
petence rather than one's skills or knowledge. Those who might find
gratification in trades and crafts are required to complete a specified
number of years of school even though the skills they acquire (or don't)
are not applicable. Schooling dulls intelligence and perpetuates a
social caste system based on wealth. Upward mobility in the caste sys-
tem is discouraged by the schools even though there is a mistaken notion
that more schooling will produce a wealthier person and higher quality
of life. The evidence, although still not conclusive (see Section II),
indicates that such expectations are false, and they are certainly
false if broad definitions of "quality of life" are employed. Illich
notes (p. 1):

Many students, especially those who are poor, in-
tuitively know what the schools do for them. They school
them to confuse process and substance. Once these be-
come blurred, a new logic is assumed: the more treatment
there is, the better are the results; or, escalation
leads to success. The pupil is thereby "schooled" to
confuse teaching with learning, grade advancement with
education, a diploma with competence, and fluency with
the ability to say something new. His imagination is
"schooled" to accept service in place of value. Medi-
cal treatment is mistaken for health care, social work
for the improvement of community life, police protec-
tion for safety, military poise for national security,
the rat race for productive work. Health, learning,
dignity, independence, and creative endeavor are de-
fined as little more than the performance of the in-
stitutions which claim to serve these ends, and their
improvement is made to depend on allocating more re-
sources to the management of hospitals, schools, and
other agencies in question.

All of the reform writers are subject to one sweeping criticism, 
and that is they do not provide any sort of blueprint on how to accom-
plish the school and social reforms they advocate. Their diagnosis
of problems in education is sharp, and often quite valuable, but their
prescriptions are vague. Certainly reform is difficult to bring about,
but to succeed at all, specific and detailed programs for implementa-
tion are needed. Etzioni (1971, p. 87) criticizes Silberman on these
grounds:

Over the recent decades our ambition to fashion
society in the shape of our values has swollen. We no
longer accept society as a given, as a pre-existing
state of nature. We view it as an arrangement, one
which we can disassemble and then rearrange. We seek
not merely to reform but to transform the relations
among the races, the classes, the nations; we seek to
deeply affect people's smoking, drug use, drinking,
and eating habits, as well as to fundamentally change
their education. Our economic, political, and intel-
lectual capacity to affect these changes has increased,
but much more slowly than our ambitions. We are not
learning, as recent discussions of the "peace dividend"
indicated, the full measure of this disparity between
ambition and resources. Even if the war is finally
terminated and the SALT talks do succeed, there ap-
parently will be available only $15 to $20 new billions
per annum for domestic reforms, which require at least
$60 to $100 billions. As a nation, it seems we are
much more inclined to talk reform than to display the
political will required to bring it about. In those
domestic sectors where the nation does find the will
and the resources, it frequently lacks the necessary
know-how. The knowledge and skills needed to provide
a viable plan for social engineering are still rudi-
mentary. Frequently we are still guided by well-meaning
but inadequately conceptualized
blueprints, by semi-utopian programs of which Silberman's
book is a recent example.

McCracken (1970) has offered the most comprehensive critique
of the reform literature, essentially on these grounds. He upholds
the classical educational values and argues that the reformers'
prescriptions are essentially nonoperational or positively harmful.

CONCLUSIONS AND POLICY IMPLICATIONS 

Educational research has not produced impressive improvements in
education, and the results of compensatory programs have been often dis-
illusioning. It was pointed out earlier in this report that there is
little probability of significantly improving classroom performance
through the development of new instructional techniques, more educa-
tional expenditures, or changes in the bureaucratic structure of the
schools, given the present limitations of knowledge and current in-
stitutional constraints in the school systems. The reform writers pro-
vide a range of observations, which in their eyes help to explain the
failure of past educational innovation: (1) schools and research focus
on unimportant objectives; (2) for many students learning cannot take
place in an authoritarian environment because children's needs and
abilities differ; (3) the substance of educational practice is largely
irrelevant and boring to the child; (4) children should not necessarily
be required to attend school.

The reform writers are often                    and they are
therefore closer to                    the typical
researcher. Of course, these writers are not representative of
teachers in general, and there are opposing views;
form writers are at all correct there must be a large opposing view,
namely the widely held and socially
formity. However,          views expressed for the
current system that are based on a diagnosis of what goes on in class-
rooms. It is in the role of diagnosticians that the reform writers
and other observers
on correct diagnosis if it is to come up with correct

If these writers are correct -- and there is sufficient agreement
on these issues to give their ideas some credibility -- then the kinds
of variables that researchers generally manipulate are indeed irrelevant
or at least of small importance. It is not surprising in light of all
this that the most promising trends in research are related to student-
treatment interaction and student-teacher matching in terms of their
ability to work together (Thelen, 1967). What is needed now is a mar-
riage of the diverse approaches of scientific research and the observa-
tional diagnoses of the reform writers.

It is imperative that sweeping innovations be attempted, at least 
on an experimental basis. But the steps must be carefully planned, the 
consequences considered, and the implementation proceed along carefully 
designed paths. Title I and Project Head Start are probably not the way 
to implement new programs, which need to be smaller in scale, more com- 
prehensively planned, and constantly monitored to avoid bad results. 
However, any major social action program is bound to produce highly 
disturbing transient effects and these too need to be planned for in 
the implementation program. Finally, no single innovative system can
succeed along all the dimensions of everyone's value system. Dis-
appointments are inevitable. But the quest is not for perfection. It
is for progress toward a more effective educational system. 




VIII. CONCLUSIONS
SUMMARY AND DISCUSSION OF FINDINGS
The Input-Output Approach 

This approach focuses on the relationship between the amounts of 
various resources that are provided to students and their educational 
outcomes (defined as cognitive achievement). Overall, the input-output 
studies provide very little evidence that school resources, in general,
have a powerful impact upon student outcomes. When we examine the re-
sults across studies we find that school resources are not consistently
important. The particular resources that seem to be significant in one
study do not prove to be significant in other studies that include the
same resources in the analysis.

Background factors, on the other hand, are always important. In
study after study a student's background has a strong influence on his
educational outcomes. Furthermore, the results are consistent across
studies. The socioeconomic status of a student's family - his parents'
income, education, and occupation - invariably proves to be a significant
predictor of his educational outcome.

The role of peer-group influences is more complex. There is good
reason to believe that these variables are, in reality, measures of a
student's background or of his school district's selection and assign-
ment policies. On balance, there is little evidence that a student's
classmates exercise a strong, independent influence on his educational
outcomes.

The results from the input-output approach do not mean that school
resources fail, actually or potentially, to affect student outcomes.
We simply observe that so far these studies have failed to show that
school resources do affect student outcomes. In particular, the studies
do not show what would happen if the educational system received a
massive increase or decrease in resources.



The Process Approach 



The approach of the psychologist focuses on a very different as-
pect of education. Resources are taken as given or predetermined. What
matters here are the processes applied to students and the interactions 
between teachers and students. For example, research may concentrate 
on the relation between teaching style and student achievement, or on 
the effects of grouping on achievement. 

We have divided the results into two parts: those derived from
studies of operating classrooms and those derived from the laboratory.
For each set of results, we indicate the focus, the questions being
asked, and the answers to the questions.

Looking first at the classroom studies, we find the following: 
o The research on teaching approaches, teacher differences,
class size, and the like shows no consistent effect on
student achievement, as measured by standardized cogni-
tive tests.

o Work on instructional methods suggests no difference among
methods; none currently appears better than conventional
methods. That is, in terms of differences in achievement,
conventional methods appear as effective as, say, teaching
by television, although the latter enables one to reach
far greater numbers of students.

We consider the following results from the laboratory studies to 
be particularly interesting and important:

o Work on the presentation of material suggests that it is not
sequencing and organization. There seem to be interaction
effects; individual methods of presentation appear superior
for some tasks and some students, but it is still hard to
characteristics, tasks, and type of instruction.

o The work on concept attainment, retention, and learning re-
wards provides a number of positive findings, but the tasks
in the laboratory are so unlike classroom learning that there
is a difficult problem of translation. For example, the more
meaningful the material, the faster it is learned and the more
it is retained. But the definition of "meaningful" is a
laboratory one, relating, say, to the difference between non-
sense sentences or syllables and those that make sense.

o What are termed interaction effects seem to exist among var-
ious types of personality, methods of reward, ability to
grasp meaningful material, and so on; but these interactions 
have not yet been studied in detail. 

In sum, the process approach has not identified the very specific 
student relations involved in learning and education. There seem to
be interactions between students and teachers, between students and
methods, between teachers and methods, and (most complex of all) among 
students, teachers, and methods. The complex three-way interactions 
have not yet been studied carefully. 

The Organizational Approach 

The work on educational organizations represents yet another ap-
proach. Schools are seen as institutions that have to satisfy multiple
goals and demands from internal bureaucracies, from the community, from
parents, and from students. The allocation of resources and the choice
of processes in schools is seen not as the result of a rational decision-
making procedure but as the outcome of history, of interactions with
constituents and with government, and of simple trial and error. The
question being asked is, How can we make the schools innovative, adap-
tive, and flexible, particularly as social demands increase and the com-
position of the student body changes?

Most of the work in this approach consists
the rules for internal and external validity          . Further-
more, there have been few attempts to extract important organizational
propositions from the literature. The case studies provide some evi-
dence for the following:

o There is a positive correlation between system size and
centralization.
have reviewed their data to "see what happened" and "discovered" that
method, or whatever.
o The larger the educational bureaucracy and the more cen-
tralization, the less innovation and adaptation there is
likely to be.

o Rigidities in the schools can be overcome partly by choice
of teachers and principals. However, teacher qualities
that are purchased -- say, experience -- have little to do
with innovative teaching.

o Real innovation depends on the leverage that can be exerted
from outside the system -- by the federal government or by
citizens.

The Evaluation Approach

This approach to educational research consists of ex post analyses
of comprehensive interventions in existing school systems. These studies
are characterized by a macro-view of educational interventions in which
treatments are devoted to groups of children in "diverse programs taken
as a whole." In short, these studies ask whether large-scale interven-
tions have had an effect in general, rather than what has been the ef-
fect of any particular intervention.

Virtually without exception, all the surveys of large, national
compensatory education programs have shown no beneficial results on
average. However, the evaluations on which the surveys report are
often based upon suspect research designs.

Two or three smaller surveys show modest positive effects of com-
pensatory education programs in the short run.
carefully designed interventions display gains in pupil cognitive per-
formance, again in the short run
vantaged socioeconomic backgrounds
highly                    There is considerable evidence
that many of the short-run gains from educational interventions fade
away after two or three years if they are not reinforced. Also, the

: ; "fade-o.it" :iu: phe i^e,:hi^ 

■ uhi«iareimost..«iltke:::regul^^ 'VAiV-vvi 
been only sparingly put into practice, if at all. And there is cer-
tainly a possibility that they may prove to be much less effective






The Experiential Approach

The experiential approach is represented by the literature of edu-
cational reform. The observer, either as researcher or
observes and describes the way that the experience
the student in relation to himself, his peers, authority, and social
institutions. The measure, for these writers, is not educational out-
comes, as indicated by standardized tests, but rather the effect of the
school experience on people's lives, where cognitive testing measures
nothing.

Because this literature is one of social reform, it is not subject
to the same tests of internal consistency as the approaches
above. In effect there are two elements in this literature: description
and prescription. The description of the schools as constituted at the
present time almost invariably emphasizes a set of common themes:

o Schools are authoritarian toward students.

o Schools make little or no allowance for individual dif-
ferences in learning styles and needs.

o Schools focus on methods that stress rightness and wrong-
ness in learning, thereby destroying independence and
creativity, as well as equipping children poorly for the

o Schools impose a certain set of social, cultural, and
ethical views on their students, thereby imposing feelings
of inadequacy and resentment on those who share neither
those views nor the traditions they imply.

o Schools are mindless in the sense that they fail, in any
operationally useful way, to question either assumptions
or the relevance of their
to children's needs.



there seems to be considerable
agreement in respect to description.

The writer's triple role
critic necessarily places
the phenomena he observes.
The prescriptions are far more varied than the descriptive re-
search. They range from recommendations for moderate reform within
the system (Silberman) to abolition of the schools (Illich). In some
cases the value systems leading to the prescriptions are made explicit,
in others, not. In general, however, the experiential literature agrees
on the merits of educational systems that are less structured, more re-
sponsive to individual diversity, and more decentralized than the cur-
rent system.
LIMITATIONS OF AVAILABLE RESEARCH

Each approach is subject to substantive and methodological prob- 
lems peculiar to itself. These problems were discussed in other 
sections and will not be reviewed here. However, some research limita-
tions appear throughout educational research and have, we feel, special
importance.

First, educational outcomes are almost exclusively measured by 
cognitive achievement. But the educational system has many functions 
and many outputs. Cognitive achievement, in particular that part 
measured by standardized tests, is only one aspect of student learning. 
Higher cognitive processes (abstract reasoning, problem solving, and 
creativity, among others) are obviously important educational outcomes 
as is noncognitive achievement. Thus, of the many and diverse kinds 
of student learning, almost all the educational research that examines
student learning is based on a narrow range of cognitive skills. There-
fore, current research cannot lead to conclusive generalizations about
educational outcomes, because it cannot measure most of them well. 

Second, there is virtually no examination of the cost implications
of research results. By and large, educational researchers have con-
centrated on discovering effective educational practices. Virtually no
attention has been paid to the notion of cost-effective educational
practices. Research results are thus difficult to translate into
policy-relevant statements.

Third, few studies maintain adequate controls over what actually
goes on in the classroom as it relates to achievement. Data on
classroom transactions are the only source of information on the content
of the student-teacher relationship. Studies that omit transactions
data can hope to identify only broad associations among variables that 
hold no matter what might be the nature of the relationship between 
student and teacher. Thus researchers' results may well be affected by 
circumstances unrecognized in their analyses. 

Finally, the data used by researchers are, at best, crude measures 
of what is really happening. Concepts such as a teacher's ability to 
teach or a student's ability to learn are easily discussed, but ob- 
jective measures of these abilities are extraordinarily elusive; and 
empirical analysis is based upon measurement. There is no way of 
knowing the extent to which inconclusive results stem from the re- 
searcher's inability to measure the variables he includes in his 
analysis accurately. 



CONCLUSIONS AND POLICY IMPLICATIONS 

With the limitations of research clearly in mind, we return to 
the issue of educational effectiveness. The first major implication
of the research is: 



Research has not identified a variant of the existing system
that is consistently related to students' educational outcomes.

The term "a variant of the existing system" is used to describe the
broad range of alternative educational practices that have been re-
viewed above. We specifically include changes in school resources,
processes, organizations, and aggregate levels of funding.

We must emphasize that we are not suggesting that nothing makes
a difference, or that nothing "works." Rather, we are saying that re-
search has found nothing that consistently and unambiguously makes a
difference in students' outcomes. The literature contains numerous
examples of educational practices that seem to have significantly af-
fected students' outcomes.^ The problem is that there are invariably
other studies, similar in approach and method, that find the same edu-
cational practice to be ineffective.
a practice that seems to be effective in one case is apparently inef-
fective in another.

^See Section VI.



We must also emphasize that we are not saying that school does 
not affect student outcomes. We have little knowledge of what student 
outcomes would be were students not to attend school at all. Educational
research focuses on variants of the existing system and tells us nothing
about where we might be without the system at all.

Furthermore, nothing we have found in the educational research 
literature proves that our current educational system cannot be sub- 
stantially improved. But the research results we review above provide 
little reason to be sanguine. Our general conclusion, so far, is that 
there are few consistent, positive, policy-relevant findings. That is,
the research offers little guidance to what educational practices 
should be implemented. This condition can arise because that is the 
way the world really is, or because researchers have been asking the 
wrong questions, or because the research methods used are not suffi- 
ciently powerful, or because the data are "bad." For whatever reason, 
we can only say that the educational practices examined thus far are 
only weakly connected to student achievement. 

Finally, the educational practices for which school systems have 
traditionally been willing to pay a premium do not appear to make a 
major difference in student outcomes. Teachers' experience and teachers'
advanced degrees, the two basic factors that determine salary, are not
clearly related to student achievement. Reduction in class size, a
favorite high-priority reform in the eyes of many school systems, seems
not to be related to student outcomes. In general, the second major im-
plication of the research (and the most important one for school finance)
is:




Increasing expenditures on traditional educational practices
is not likely to improve educational outcomes substantially.




The third major policy implication of the research is:

There seem to be opportunities
or redirection of educational expenditures without
Researchers have examined many variants of the existing educational
system. As we have indicated, none of these variants has been shown to 
improve educational outcomes consistently. A fact often overlooked is 
that few have been shown to lead to significantly worse outcomes either. 
Consequently, educational research has provided a long list of equally 
effective variants of the existing system. And, if these variants are 
not all equally expensive, then choosing the least expensive provides 
opportunities to redirect (or even reduce) costs without also reducing
effectiveness.^

Educational research consists almost entirely of effectiveness 
studies. There are very few cost-effectiveness studies. The tremen- 
dous volume of "negative" results - negative according to the peculiar 
bias of educational research, which seeks only improvement on the ef-
fectiveness side -- must surely contain many "positive" results in the
sense of indicating less costly methods of accomplishing as much as is
currently attained.



The research contains some evidence supporting a fourth major
finding:

Innovation, responsiveness, and adaptation in school
systems decrease with size and depend upon exogenous
shocks to the system.

In other words, large systems are less likely to be innovative, respon-
sive, or adaptive than are small systems. Further, whatever the size
of the system, innovation is not apt to come from within the system.
Outside pressures, from the community or from the federal government,
are likely to be needed. We note, however, that relatively little
research has been directed toward these issues. Hence, this finding
must be viewed as tentative.



The implication of this tentative conclusion is clear. There is
currently a good deal of interest in federal leverage and in the ques-
tion of whether federal aid to the schools should be tied or untied.



This conclusion applies only to questions of effec-
tiveness as now measured. It cannot be applied to justify situations
in which constant or decreasing expenditures would impair the health
or safety of children and staff.
The literature that we have examined suggests that federal influence
is important in getting innovation into urban school systems, although
the hypothesis has not really been tested rigorously.

Our review of educational research supports a fifth major finding:

Educational research is seriously deficient in terms of
the size, scope, and focus of research efforts and in the
integration of research results.

Beyond these specific limitations, educational research has tended to 
be small in scale, narrow in scope, diffuse, maldistributed, and lacking 
in focus. By comparison with other major sectors, the amount of re- 
search activity devoted to educational problems is surprisingly small. 

For example, the amount of resources allocated to agricultural research 
and development is more than four times as large, and health research 
is allocated more than 13 times as much. Moreover, educational re-
search is a relatively recent development. Quantitative research on
American education goes back to the work of Joseph Meyer Rice in the 
1890s; but significant levels of activity did not begin until the late 
1950s when first the National Science Foundation, then the Office of 
Education, began to fund a wide range of research activities. A com- 
parison of R&D communities by institutional affiliation shows that edu- 
cational research is very unlike other R&D sectors in the economy because 
colleges and universities perform the majority of R&D in the educational 
sector. The academic community tends to conduct relatively small studies 
on a part-time basis and to concentrate on basic research. Furthermore, 
educational research has tended to be the almost exclusive domain of the 
psychologist. Only recently has it begun to attract the attention of
more than a handful of well-trained researchers in other fields. 

The body of educational research now available leaves much to be 
desired, at least by comparison with the level of understanding that 
has been achieved in numerous other fields. This does not reflect the 
quality of the contemporary educational researcher, but rather the
nature of the research community and its history. The typical educa-
tion study is not founded on a wealth of previous knowledge and under-
standing nor is it directed toward the needs of the educational policy-
maker. There are virtually no research-based, problem-solving units
in the typical operating agency. In 1968 there were only 1,300 man-
years devoted to research, development, or innovation in the almost
20,000 state and local education agencies; most of that was devoted to
testing and to gathering statistics (Levien, 1971).

^See Levien (1971) for a discussion of the current state of
educational research.

Finally, the sixth major implication of our work is:

Research tentatively suggests that improvement in student
outcomes, both cognitive and noncognitive, may require
sweeping changes in the organization, structure, and
conduct of educational experiences.

This inference follows from the first four conclusions cited 
above, as well as from the testimony of the experiential approach. 

Even the fifth conclusion, which cites the paucity of educational
research, tends to reinforce this point, because it implies that marginal
changes in research will be inadequate to point clear directions for
educational improvement. The next subsection offers hypotheses that
are broadly consistent with the "sweeping change" inference.

WHERE DO WE GO FROM HERE? THE SUBSTANTIVE ISSUES

Our review of educational research found little association between
various educational practices - resources, processes, organizations,
and so on - and students' educational outcomes. We also inferred
reasons why this seems to be so: the role of non-school factors,
interaction, and inappropriate forms of education. Although they have
been recognized in the past by many educational researchers, they have
not been carefully investigated to any great extent. They are
potential explanations of why research has not revealed the expected
connections between educational processes and educational outcomes.

Non-School Factors

There is considerable evidence that non-school factors may well
be more important determinants of educational outcomes than are school
factors. The research repeatedly finds high correlations between
students' socioeconomic backgrounds and educational outcomes. A variety
of hypotheses as to why this relationship seems so powerful have been
put forward.

o At one extreme, there are some who argue that genetic differ- 
ences among children are associated with their racial, cul- 
tural, or social backgrounds. According to this view there 
are differences among children with respect to their learning 
ability, and these differences are, in turn, correlated with
their environments. 

o Others have argued that environment is correlated with
educational outcomes because much of the child's learning occurs
outside of school. The child raised in an environment of
poverty is seldom exposed to museums or libraries, lives
in a home where few books are present, and generally is not
exposed to the variety of educational experiences available
to the advantaged child.

o A third and somewhat related view also argues that much of a
child's learning occurs outside school. What children learn
outside school, it is argued, depends upon what their
environments offer to them by way of experiences. Thus the child
raised in an environment of poverty learns "as much" as a
child raised in a middle-class family; but precisely what
he or she learns is quite different from what the middle-class
child learns. However, this argument goes on, the tests or
measures of educational outcomes are oriented toward the
middle class and, roughly speaking, give full value to what
the middle-class child has learned outside school but only
partial credit to what the poor child has learned outside
school.

o Still others have argued that a child's background influences
his educational outcomes by affecting his attitudes. According
to this view, the disadvantaged child lacks motivation or
does not aspire to educational success. His parents are
likely to have attained relatively low educational levels.
He faces racial and/or class discrimination which reduces
his prospects of success (compared with the middle-class 
child) despite his educational attainment. The general 
thrust of this argument is that the disadvantaged child is 
not encouraged, either directly (e.g., by parental pressure) 
or indirectly (e.g., by observation of the "payoff" of educa- 
tion to others like himself), to seek success in the schools. 

The above are but a few of the many hypotheses as to why
educational outcomes seem to be unaffected by variations in educational
practices. They are not necessarily the most likely to be true, but 
they illustrate how a broad range of background factors may be adduced 
in asserting their domination of educational outcomes. 

None of this means that schools do not or cannot affect outcomes 
but it does imply that factors outside of the schools have a strong 
influence on students' educational outcomes, perhaps strong enough to 
"swamp" the effects of variations in educational practices. This is the 
important point: Are our educational problems school problems? The
most profitable line of attack on these educational problems, under this
hypothesis, may not be through the schools at all. But we have very
little knowledge as to how and to what extent educational outcomes are
affected by non-school factors. We can only observe that there is
considerable evidence that non-school factors are closely associated with
students' educational outcomes. The best information we have, regardless
of the deficiencies we have noted, is that schools do not now have a 
tremendous impact on the achievement that does occur. Therefore, it is 
logical to infer that the whole substantive area of non-school learning 
deserves much more attention than it has received from past research. 

Interaction 

There is some (weak) evidence that the impact of an educational 
practice may be conditional on other aspects of the situation. Simply 
stated, this hypothesis argues that teacher, student, instructional 



^See, for example, Thelen (1967).

method, and, perhaps, other aspects of the educational process interact
with each other. Thus, a teacher who works well (is effective) with 
one type of student using one method may accomplish far less when work- 
ing with a different type of student, even if using the same instruc- 
tional method. Accordingly, the effectiveness of a teacher, or method, 
or whatever, may vary from one situation to another. 

We have discussed the notion of interaction at length in Section IV 
and will not repeat that discussion here. The important point to be 
made is, perhaps, that research has not discovered an educational prac- 
tice that is consistently effective because no educational practice 
always "works" regardless of other aspects of the educational situation. 
Interaction may explain why educational research has thus far failed 
to identify any educational practice that is consistently effective. 
There may not be any universally effective educational practices.

Thus far, teachers (or students) are generally viewed as inter- 
changeable within broad constraints. Educators voice concern if, say, 
a sixth-grade teacher is asked to teach the third grade, or if a science 
teacher is assigned to an English class. But if a sixth-grade teacher
is teaching sixth-graders, few have asked whether that teacher would
be more effective if assigned to teach a different set of students.

If interaction in fact exists, it may be possible to assign teachers 
to students so that each teacher (and student) is working with the 
particular type of student (and teacher) with whom he or she is parti- 
cularly effective. Thus, the concept of interaction should be viewed 
as not only a potential explanation of our inability to identify
consistently effective educational practice, but also as a prospective
path toward improving educational outcomes. 
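
The idea of interaction can be made concrete with a small sketch. The numbers below are invented purely for illustration and come from no study; the names `gains`, `teacher_average`, and `best_assignment` are hypothetical. The sketch shows a case in which two teachers look identical when averaged over student types, yet the total gain depends heavily on which teacher is matched with which type of student.

```python
from itertools import permutations

# Hypothetical mean achievement gains for each (teacher, student-type) pairing.
# These values are invented for illustration; they are not research findings.
gains = {
    ("teacher_A", "type_1"): 8.0,
    ("teacher_A", "type_2"): 2.0,
    ("teacher_B", "type_1"): 2.0,
    ("teacher_B", "type_2"): 8.0,
}
teachers = ["teacher_A", "teacher_B"]
student_types = ["type_1", "type_2"]


def teacher_average(teacher):
    """Average gain for one teacher across all student types."""
    return sum(gains[(teacher, s)] for s in student_types) / len(student_types)


def best_assignment():
    """Try every one-to-one pairing of teachers with student types; keep the best."""
    best_total, best_pairs = float("-inf"), None
    for ordering in permutations(student_types):
        pairs = list(zip(teachers, ordering))
        total = sum(gains[p] for p in pairs)
        if total > best_total:
            best_total, best_pairs = total, pairs
    return best_total, best_pairs


averages = {t: teacher_average(t) for t in teachers}
total, pairs = best_assignment()
print(averages)      # averaged over student types, the teachers look the same
print(total, pairs)  # matching teachers to student types changes the total gain
```

Under these assumed numbers, a study that compared only teacher averages would find no difference between the two teachers, which is one way an interaction could hide a real effect.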

We must emphasize that we now know very little about interactions.
There have been few attempts to examine interactions, and there is some
controversy among educational psychologists as to whether interactions
actually exist. Most of the evidence for the existence of interactions
comes from ex post rationalization of research results. That is, some
researchers, confronted with unexpected results in their analyses,
have reviewed their data to "see what happened" and "discovered" that
there was interaction among student, teacher, method, or whatever.

The possibility that any given teacher may be more or less effective
when working with one group of students than when working with another
is too important to overlook and is therefore another priority field
for research.

Different Forms of Education

Finally, there is a suggestion that substantial improvement in
educational outcomes can be obtained only through a vastly different
form of education. Those who argue this hypothesis question whether
the system, as currently constituted in the United States,
can be substantially improved. It is seen, at the extreme, as being
a bureaucratic, rigid, unresponsive structure that no amount of
marginal change can improve. Both the organizational approach and the
experiential approach argue for this hypothesis.

In some cases, critics of the system focus on the organization of 
the school's basic unit, the classroom. They argue that traditional 
instructional practices fail to capitalize on children's natural
curiosity and interest in learning. Team teaching, the use of audio- 
visual aids, and other instructional methods make little difference, 
according to these critics, so long as the child is forced to devote 
his attention to the teacher's choice of topics. Open schools, schools 
without walls, and the like are seen as being the solution.

Other critics have found fault with the incentive structure in
the schools. They argue that rewards and penalties are distributed
among teachers and administrators according to implicit rules that
emphasize factors unrelated to educational effectiveness. Those who
share this perspective tend to argue for systems in which incentives
are directly tied to educational outcomes, such as voucher systems,
performance contracting, and so on.

Research tells us little about how effective these vastly different
forms of education might be.^ They are novel systems that have



^But see Carpenter and Hall (1971).

been only sparingly put into practice, if at all. And there is certainly
a possibility that they may prove to be much less effective
than the current system. Large-scale experiments or demonstrations of 
these vastly different forms of education should be implemented and 
carefully observed and evaluated.

WHERE DO WE GO FROM HERE? THE METHODOLOGICAL ISSUES

The policy results also raise another issue: What kind of
research is now possible and worth doing? To begin, we consider this
issue for each approach separately; then we raise the question of what 
is now needed to create real policy analysis for education. First, 
with respect to the input-output approach, only one of the studies
analyzed (Hanushek, 1970) was able to match student achievement with
resources — in particular, teachers — to which the students were 
actually exposed. (Ordinarily, student achievement is matched to 
average school resources). This study found that teachers make a dif- 
ference (for ethnic majority students), but it was unable to identify 
what qualities of a teacher make a difference. Thus, some research 
should be devoted to pushing this enterprise further. But this means 
that more resources will have to flow into creating new data. None of
the currently used and widely analyzed data sets (EEOS, Talent,
Plowden) enables the investigator to match individual achievement
with individual resources.

For the process approach it is important to pin down the inter- 
action effects. This will require complex experimental designs. We 
also believe that it is important to work on translating promising 
research and development results into the operating classroom. This
will mean a much closer scrutiny of the R&D experiments themselves 
and of the means of disseminating and evaluating results.


The organizational approach is one of the least rigorous and
robust. The kinds of questions we want answered about educational
organizations need to be expressed more clearly, and the sampling
procedures need to be improved. A balance must be struck between the
in-depth richness provided by small samples and the generalizability
provided by large ones. The organizational approach has a close
relation to alternative financial structures. It is hard to influence
the choice of processes or resources within schools or classrooms from
outside without creating massive problems of control. It may be
possible to influence educational organizations through new financial
schemes, but the organizational approach has as yet not identified
effective methods for applying that influence.

The evaluation approach is the most policy-oriented of those
considered here. Therein lies its greatest strength and also its greatest
weakness. In large program evaluations, across many individual
projects, the basic question to be answered is, to what extent was the
program successful in general? Thus, large-scale evaluations tend to
lump together individually successful and unsuccessful projects to
arrive at a general conclusion about program effectiveness. This
general assessment provides an estimate of what would happen if the
program were implemented elsewhere, which is extremely useful to know.
On the other hand, large program evaluations are seldom sufficiently
detailed to explain why some projects succeeded while others failed.
But this is, perhaps, the more important information. Evaluators
clearly must pay much more attention to the differences between
successful and unsuccessful projects within programs. If these
differences can be identified and understood, the successful projects
should be used as models for further within-program development.

We feel that the books and articles that make up the reform
literature have provided insights rather than answers. These insights
must be checked, verified, refined, and extended. We need to develop
methods of analysis that will allow us to distinguish the effects of
the ways in which schools are organized, the way in which a particular
school is organized, and the personalities of a particular set of
individuals. Thus, an elementary school teacher may tell us that the
children in her school are brutalized. We have to be able to determine
whether or not this situation stems from the underlying structure of
our schools. Does the way in which we go about providing elementary
education "build in" incentives that stimulate such treatment? Or,
alternatively, does this situation come about because of the way a
particular school or school district is structured? Or, is this
behavior a function of the types of people that happen to staff that
particular school? In short, the reform literature describes the
pathology of the schools. Is this pathology idiosyncratic, or applicable
to a wide range of schools? Can the prescriptions of the reformers
be translated into operational planning and generalized to a wide
range of schools? If they can, would their prescriptions be acceptable
to the clientele of the American education system, in other words,
to all of us? There is, after all, substantial evidence that most
Americans think the schools do pretty well now. If major increases
in effectiveness require fundamental restructuring of education, then
effective reform might be unacceptable to the public even if costs were
thereby reduced.

We believe three things are needed in educational policy analysis.
First, it will be necessary to merge the various research approaches.
If economists want to fit educational "production functions," they
will have to revamp the approach completely to include in their models
specific processes and organizational factors that affect students,
as well as interaction effects. The failures of the input-output
approach are, in fact, causing everyone to look more deeply at
fundamental assumptions about education. And so the economists find
themselves face to face with the psychologists and educators, being
forced into a detailed analysis of what goes on in schools and classrooms.
Second, we simply must measure education in relation to many more
outcomes and dimensions (including time) than is currently being done.
More resources must be devoted to designing new measures and
instruments, and research will have to focus on outcomes over time.
Organizationally, this implies some permanent institutional arrangement
that will keep the long-run research policy relevant. Third, cost
considerations must be brought into analyses. We are almost certainly
overlooking many opportunities to redirect scarce educational
resources effectively and will continue to do so until a firm base of
cost-effectiveness research is built.



We have consciously avoided any explicit discussion of the aims
of education for two reasons: A study of those aims was not part of
our charter; furthermore, since such a study would rely ultimately on
personal values, the researcher is no more competent than any other
citizen to solve these issues.

Yet as James has said (1971): 

We have been notably unsuccessful as a society in this 
century in stating our aims of education. The prospect of 
allowing ourselves to be pressured by narrow concerns, driven 
by casual circumstances — like our rather uncritical embrace 
of "accountability" — to set trivial goals for our educational 
institutions is appalling. We desperately need, for the long 
range, not to preoccupy ourselves with the trivial, but to 
shape our goals to fit our broadest perception of the needs 
of human life, and to challenge our model-builders to reach 
toward them, and to be critical of failures to reach them. 

Our review here of what is known about educational effectiveness 
is a first short step to responding to that challenge, by identifying 
the limitations of our present knowledge and methods and pointing out 
possible paths toward improvement. The larger task set forth by James 
can only come from interdisciplinary efforts of an intensity, breadth, 
and continuity heretofore unknown, but not by that token unattainable. 

Appendix A
INPUT-OUTPUT STUDIES

This appendix lays out in some detail the studies examined in the
input-output approach - the first of the five approaches discussed in
the text. In addition to reporting the results claimed for particular
studies, we have made some effort to explain what each analyst did.
For each of the 18 studies discussed here, the reader will find the
following:

o Author(s), title, publisher, date
o Unit of analysis: whether analysis was applied to
schools or to individual students
o Sample size and description
o Kinds of data
o List of variables (all independent and dependent
variables are included)
o Procedure: what the analyst did as well as the
techniques used
o Results

The studies are arranged in chronological order. 

William G. Mollenkopf and S. Donald Melville, A Study of Secondary
School Characteristics as Related to Test Scores, Research
Bulletin RB-56-6, Educational Testing Service, Princeton, 1956.

UNIT OF ANALYSIS 
School. 



SAMPLE 

(a) 100 schools (9,600 ninth graders), (b) 106 schools
(8,357 twelfth graders).

DATA

Independent variables were drawn from a questionnaire filled out
by principals. Dependent variables were drawn from special tests
administered in the schools at the request of the Educational Testing
Service (ETS).

VARIABLES 

Independent 

1. Number of school facilities (e.g., auditorium, gymnasium).
2. Percent full-time teachers five or more years college training.
3. Percent full-time teachers five or more years experience.
4. Percent full-time teachers aged 36-60.
5. Years of experience of principal.
6. Degree level of principal.
7. Percent principal's time-supervision.
8. Percent principal's time-administration.
9. Number of special staff.
10. Pupil/teacher ratio.
11. Drop-out index [(a) 12th graders/10th graders;
(b) 9th graders/7th graders].
12. ADA/number pupils 7th grade or higher.
13. Average class size.
14. Public library in region.
15. PTA members/number pupils 7th grade or higher.
16. Percent graduates entering college.
17. Percent support from state aid.
18. Average teacher salary.
19. Supplies and library expenditure/number pupils 7th grade
or higher.
20. Percent fathers high school graduate.
21. Percent fathers employed as professionals.
22. Percent fathers employed as farmers.
23. Percent fathers employed as craftsmen.
24. Rate of growth of community, 10 years.
25. Size of community (urban/rural).
26. Number of pupils in school, 7th or higher.
27. South or non-South.

Data were collected for seven additional independent variables
that were discarded a priori.



Dependent 



1. Vocabulary test score.
2. Sentence completion test.
3. Arithmetic reasoning test.
4. Arithmetic computation test.
5. English achievement test.
6. Social studies achievement test.
7. Science achievement test.



PROCEDURE 

Questionnaires were sent to 1,877 high school principals. Replies
were obtained from 844 (560 indicated willingness to administer tests).
A stratified sample (by independent variables 3, 16, and 17, selected
by factor analysis of independent variables) was chosen from among these.

Mean aptitude test scores (numbers 1-4) were calculated for each
school. Independent variables were dichotomized near the median and
correlated with mean test scores. Based on these simple correlations,
six independent variables (numbers 14, 17, 19, 20, 25, 27) were chosen
for further study.

Parts 1 and 2 of the aptitude test were combined to obtain a
verbal score. Parts 3 and 4 were combined to obtain a quantitative score.
For each four-way combination of the six independent variables, a
multiple correlation coefficient for each score was calculated for the
9th and 12th grade samples. Variables number 19 and 27 consistently
appeared in the combinations yielding high correlations.

The simple correlation matrix shows that five other variables were
sometimes correlated with the achievement test scores (numbers 9, 10,
13, 16, 21). Stepwise regression was used eight times (9th and 12th
graders by three achievement scores and total achievement score) to
choose from among these 11 independent variables. Regression
coefficients are reported, but no significance levels or standard errors
are given. R² was generally higher in 12th grade equations.
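
The screening steps in this procedure can be sketched in a few lines. The school values below are invented for illustration and are not data from the study; the helper names `dichotomize` and `pearson` are hypothetical, and cutting strictly above the median is an assumption about what "near the median" meant. The sketch codes a variable as high or low, correlates that code with mean test scores, and counts the four-way combinations that can be formed from six retained variables.

```python
from itertools import combinations
from statistics import mean, median


def dichotomize(values):
    """Code each school 1 if its value is above the median, else 0."""
    cut = median(values)
    return [1 if v > cut else 0 for v in values]


def pearson(x, y):
    """Simple (Pearson) correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Invented values for six hypothetical schools (not data from the study).
library_spending = [2, 4, 6, 8, 10, 12]
mean_test_score = [50, 52, 51, 58, 60, 59]

flags = dichotomize(library_spending)
r = pearson(flags, mean_test_score)

# Variable numbers retained for further study, as listed in the text.
retained = [14, 17, 19, 20, 25, 27]
combos = list(combinations(retained, 4))

print(flags)
print(round(r, 3))
print(len(combos))
```

Six variables taken four at a time give 15 combinations, so the procedure implies 15 multiple correlations per score and grade sample.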

RESULTS 

Average class size and percentage of last year's graduates who
went on to college occurred most often.

James Alan Thomas, Efficiency in Education: A Study of the
Relationship Between Selected Inputs and Mean Test Scores in a Sample of
Senior High Schools, unpublished Ph.D. dissertation (microf.),
Stanford University Library, 1962.

UNIT OF ANALYSIS 
School. 

SAMPLE 

206 schools in communities between 2,500 and 25,000 population.

DATA 

School output and input data were drawn from Project TALENT data
bank. Data on socioeconomic characteristics of home and community were
drawn from the Census.

VARIABLES 

Independent 

1. Size of 12th grade class.
2. Median starting salary, male teachers.
3. Expenditure/pupil (Grades 9-12).
4. Type of school (academic versus comprehensive).
5. Grades included in school (10-12, 9-12, etc.).
6. Number of days in school year.
7. Average class size, science and math.
8. Average class size, non-science.
9. Average amount of homework expected.
10. Number of study hall periods/week.
11. Number of books in school library.
12. Age of building.
13. Provision for grouping.
14. Median starting salary, female teachers.
15. Average experience of teachers.
16. Presence of guidance program.
17. Town population.
18. Adult (in town) median years of schooling.
19. Unemployment rate.
20. Percent labor force in manufacturing.
21. Median family income.
22. Miles to nearest city larger than 100,000.
23. Percent rural farm.
24. Percent children in private schools.
25. Percent population born in state.
26. Percent employment white collar.
27. Percent owner-occupied homes.
28. Quality of housing.
29. Average daily percent absent.
30. Delinquency rate.
31. Percent dropouts after entry into 10th grade.
32. Percent males went on to college last year.


Dependent 




1. Information test, 10th grade, boys.
2. Information test, 10th grade, girls.
3. Information test, 12th grade, boys.
4. Information test, 12th grade, girls.
5. English test, 10th graders, all.
6. English test, 12th graders, all.
7. Reading comprehension, 10th graders, all.
8. Reading comprehension, 12th graders, all.
9. Creativity, 12th graders, all.
10. Mechanical reasoning, 10th graders, all.
11. Abstract reasoning, 10th graders, all.
12. Abstract reasoning, 12th graders, all.
13. Mathematics II, 10th graders, all.
14. Mathematics I, 12th graders, all.
15. Mathematics II, 12th graders, all.
16. Mathematics 12th graders, all.
17. Physical science, 12th graders, boys.
18. Mechanics, 12th graders, boys.



PROCEDURE 

A stepwise, multiple regression was run for each of the 18
dependent variables. All independent variables were considered in every
case.
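
For readers unfamiliar with the technique, the following is a minimal sketch of forward stepwise regression. It is not the routine used in the study: the entry rule here (a minimum gain in R² of 0.01) is an assumption chosen for simplicity, whereas stepwise programs of the period typically used F-to-enter tests. The data and the names `solve`, `r_squared`, and `forward_stepwise` are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def r_squared(columns, y):
    """R^2 of a least-squares fit of y on an intercept plus the given columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(design[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def forward_stepwise(predictors, y, min_gain=0.01):
    """Add the predictor that raises R^2 most; stop when the gain is below min_gain."""
    chosen, current = [], 0.0
    remaining = list(predictors)
    while remaining:
        scores = {
            name: r_squared([predictors[c] for c in chosen] + [predictors[name]], y)
            for name in remaining
        }
        best = max(scores, key=scores.get)
        if scores[best] - current < min_gain:
            break
        chosen.append(best)
        current = scores[best]
        remaining.remove(best)
    return chosen, current


# Invented data for eight hypothetical schools (not from the study).
predictors = {
    "family_income": [1, 2, 3, 4, 5, 6, 7, 8],
    "class_size": [30, 28, 31, 27, 29, 26, 30, 25],
    "library_books": [5, 3, 6, 2, 7, 4, 8, 1],
}
noise = [0.5, -0.5, 0.3, -0.3, 0.2, -0.2, 0.4, -0.4]
scores_y = [40 + 3 * x + e for x, e in zip(predictors["family_income"], noise)]

chosen, r2 = forward_stepwise(predictors, scores_y)
print(chosen, round(r2, 4))
```

Because the selection is driven by the sample at hand, the variables that enter can change from one dependent variable or sample to the next, which is worth keeping in mind when reading lists of "consistently significant" variables.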

RESULTS 

R² ranged from .77 to .87. F tests indicated very significant R
in every case (minimum F is 8.12; maximum F, 17.40). In one regression
equation (dependent variable number 18), all 32 independent variables
were significant at the 1 percent level. (Beta coefficients were all
at least 10 times their standard error.) Consistently significant
positive (negative) variables were 1-4, 6, 11, 12, 14-16, 18, 21, 24, 25,
27, 28, 31, 32 (5, 7-9, 29). Consistently insignificant variables were
10, 13, 17, 19, 20, 22, 23, 26, 30.

Charles Benson et al., State and Local Fiscal Relationships in Public
Education in California, Report of the Senate Fact Finding Committee
on Revenue and Taxation, Senate of the State of California, Sacramento,
March 1965.

UNIT OF ANALYSIS

School District.

SAMPLE

Fifth-grade pupils in 249 California school districts.

DATA

Data on socioeconomic variables for districts' attendance areas
were collected from 1960 Census. Data on school resources were obtained
from district records.

VARIABLES

Independent

1. District taxes/total income.
2. State aid/total income.
3. Other aid/total income.
4. Total income/ADA.
5. Instructional expense/total expense.
6. Instructional expense/total ADA.
7. Total expense/total ADA.
8. Percent teachers in highest salary quartile.
9. Percent teachers in lowest salary quartile.
10. Percent teachers in provisional salary quartile.
11. ADA/teacher.
12. Teachers/administrators.
13. Mean teachers' salary.
14. Mean administrators' salary.
15. Teachers'/administrators' salary.
16. Teachers' salary/ADA.
17. Administrators' salary/ADA.
18. Median household income.
19. Median adults' education.
20. Unemployment rate.
21. Percent persons under 18 living with both parents.
22. ADA.
23. Size of attendance area.
24. Assessed value/ADA.
25. Tax rate.



Dependent 

Score on reading achievement test.

PROCEDURE

The sample was divided by size of district (in ADA) into three
subsamples. After preliminary inspection of simple correlations, ten
independent variables (numbers 4, 6, 8, 9, 11-14, 18, 22) were included
in a stepwise regression for each subsample.

RESULTS

For the smallest size category, independent variables 6, 8, 18,
and 22 were significant and positive. Independent variables 9 and 12
were significant and negative.

For the middle-sized districts, independent variables 8, 13, and
18 were both positive and significant. Variables 12 and 22 were
negative and significant.

For the largest districts, independent variables 8 (-), 12 (+),
and 18 (+) were significant.

James S. Coleman et al., "Pupil Achievement and Motivation," Chapter 3,
Equality of Educational Opportunity, U.S. Department of Health,
Education, and Welfare, U.S. Office of Education, OE-38001,
Washington, D.C., 1966, pp. 218-333.

UNIT OF ANALYSIS

Individual/School.

SAMPLE

645,000 students in the 1st, 3rd, 6th, 9th, and 12th grades in
about 3100 schools.






*The Coleman report is a massive document presenting the results
of research into a number of educational problems. We are concerned
here only with that segment of the report that deals with the
relationship between school resources and background factors and student
outcomes.

DATA

1170 high schools were randomly chosen within a stratified sampling
scheme. Every elementary school that sent over 90 percent of its
graduates to a selected secondary school was included in the sample of
elementary schools. The remainder of the elementary school sample was
selected from other feeder schools by a stratified, random process. The
total elementary school sample contained 3223 schools. School resources
were derived from questionnaires applied to school superintendents,
principals, and teachers. Background factors were drawn from
questionnaires applied to individual students. Student outcomes were
obtained from a battery of tests administered under ETS direction. Both
principal and pupil questionnaires were obtained from 689 high schools.

VARIABLES 

Independent 

1. Retidlng naterlal in aoine. 

2. Poesesslons in honiee 

3. Parents* education. 

4. Mt 0 i>er of siblings. 

5. Parents* educational desires. 

6. Parents* interest. 

7. Integrity of home. 

8. Changing schools, 

9 . Foreign language . in home . 

10. Urbanism of backgrowid. 

11. Control of environment. 

12. Self concept. 

13. Interest in school. 

14. Homework. 

15. Preschool. 

16. N»anbcr of students in school in grade. 

17. Nonverbal mean score. 

18. Verbal mean score. 

19. Proportion Negro in grade. 
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20. Proportion vhite in grade. 

21. Proportion Mexican-American in grade. 

22. Proportion Puerto Rican in grade. 

23. Proportion Indian in grade. 

24. Proportion Oriental in grade. 

25. Proportion other in grade. 

26. Average white in class last year. 

27. Average white throughout school. 

28. Proportion definite plans for college. 

29. Proportion mother attends college. 

30. Proportion mother wishes excellence. 

31. Proportion own encyclopedia. 

32. Proportion college prep curriculum. 

33. Proportion read over 16 books. 

34. Proportion member debate club. 

35. Average number science courses. 

36. Average number language courses. 

37. Average number mathematics courses. 

38. Average time with counselor. 

39. Proportion teachers expect to be best. 

40. Proportion no chance for successful life. 

41. Proportion want to be best in class. 

42. Average hours homework. 

43. Teachers' perception of student quality. 

44. Teachers' perception of school quality. 

45. Teachers' SES level. 

46. Teachers' experience. 

47. Teachers' localism. 

48. Teachers' quality of college attended. 

49. Teachers' degree level. 

50. Teachers' professionalism. 

51. Teachers' attitude toward integration. 

52. Teachers' preference for middle-class students. 

53. Teachers' preference for white students. 

54. Teachers' verbal score. 

55. Teachers' variation in proportion of white students taught. 

56. Teachers' proportion male. 

57. Teachers' proportion white. 

58. Teachers' proportion certified. 

59. Teachers' average salary. 

60. Teachers' number of absences. 

61. Teachers' attended institute for disadvantaged. 

62. Teachers' attended NSF institute. 

63. Pupils/teacher. 

64. Percentage makeshift rooms. 

65. Specialized rooms and fields. 

66. Science lab facilities. 

67. Library volumes/student. 

68. Extracurricular activities. 

69. Separate classes for special cases. 

70. Comprehensiveness of curriculum. 

71. Number of specialized teachers and correctional personnel. 

72. Transfers. 

73. Number of types of tests given. 

74. Movement between tracks. 

75. Accreditation index. 

76. Days in session. 

77. Age of texts. 

78. Part-day attendance. 

79. Teacher turnover. 

80. Guidance counselors. 

81. Attendance. 

82. Percent graduates who go on to college. 

83. Principal from teachers college. 

84. Principal's salary. 

85. School location (urban/rural). 

86. Length of academic day. 

87. Tracking. 

88. Accelerated curriculum. 

89. Promotion of slow learners. 

90. Attitude toward integration. 

91. Instructional expenditure/pupil. 

92. School board elected. 

93. Teachers examined. 

Dependent 

1. Score on nonverbal test. 

2. Score on general information test 1. 

3. Score on general information test 2. 

4. Score on general information test 3. 

5. Score on general information test 4. 

6. Score on general information test 5. 

7. Score on general information total. 

8. Score on verbal test. 

9. Score on reading test. 

10. Score on mathematics test. 

PROCEDURE 


Simple correlation matrices were constructed and examined. The 
60 independent variables that appeared to be most important were 
selected and used for all grades. (At lower grades some variables were 
nonexistent, reducing the total at those grades.) Preliminary regressions 
were then run and further variables deleted. Final analyses were 
conducted on 6th, 9th, and 12th grade samples stratified, at each grade 
level, by race and region (North/South). 

In each case a sequence of regression runs was made in which blocks 
of variables were added to a regression and the additional explanatory 
power of each block of variables was calculated. Regression coefficients 
and tests of significance were not reported. Background factors are 
always entered prior to any of the three main categories of variables: 
student-body variables, school facilities and curriculum measures, and 
teachers' characteristics. Verbal achievement is the only dependent 
variable for which results are reported. 

RESULTS* 

Background Factors 

Eight background factor variables (numbers 1-7 and 10) explained 
about 15 percent and 10 percent of the variance in the achievement of 
Southern and Northern Negroes, respectively. The explanatory power of 
background factors for Northern and Southern whites was about 20 
percent in each case. 

School Facilities and Curriculum 

In general, measures of school facilities and curriculum accounted 
for an extremely small amount of variation in student achievement. 

Eleven variables (numbers 16, 66-68, 70, 74, 80, 85, 87-88, and 91) 
were used to measure facilities and curriculum. Instructional 
expenditures per student (91) accounted for less than .3 percent of the 
variation in achievement after the six "objective" background factors 
(1-4, 7, 10) were controlled for three of the four major subgroups. 
For Southern Negroes, this variable accounted for about 3 percent of 
the variation in achievement after background factors were entered. 

The unique contribution of the school facilities and curriculum 
measures varied among grade levels and race/region subgroups. But the 
only cases where the additional explanatory power of these eleven 
variables (entered after the six "objective" background factors) exceeded 
about 3 percent were, again, Southern Negroes. There, these variables 
generally added about 8 percent. 

Teachers' Characteristics 

Seven teacher variables (numbers 45-47, 49, 52, 54, and 57) were 
selected. Controlling for the six "objective" background variables, 
teachers' characteristics contributed between 1 and 2½ percent 
explanatory power for whites, about 3 percent explanatory power for Northern 
Negroes, and about 8½ percent explanatory power for Southern Negroes. 



*Results are reported for five grade levels by ten racial/regional 
subsamples. We will concentrate on the 9th and 12th grade results for 
Northern and Southern Negroes and whites. 

Student-body Characteristics 

Five student-body variables (numbers 28, 31, 42, 72, and 81) 
accounted for far more variation in the achievement of minority group 
children than did any attributes of school facilities and somewhat more 
than did attributes of staff. Controlling for the six "objective" 
background factors and eleven school characteristics (facilities and 
curriculum), student-body characteristics added about 4 percent to the 
explanatory power of the Negro regressions and about 1½ percent to the 
explanatory power of the white regressions. 

The variable, proportion white students in school, had a negligible 
effect upon white achievement under all conditions. For Negroes the 
variable added to the explanatory power of an equation that includes 
the six "objective" background factors and instructional expenditures 
per pupil: 1½ to 3 percent if no other variables are controlled, and 
a negligible amount if student-body characteristics are also controlled. 



Total Impact 

The six background factors accounted for about 13 (7½, 16, ) 
percent of the variance in the achievement of Southern Negroes (Northern 
Negroes, Southern whites, Northern whites). The seven teacher 
characteristics added about 8 (3, 2½, 1½) percent to the explanatory power 
of the equation. Adding the eleven school variables increased the 
regression's explanatory power by about 3½ (2, 1½, 1) percent. Finally, 
adding the student-body variables increased explanatory power by about 
2 (2, 1, 3/4) percent. Overall, then, the production function accounted 
for about 26 (15, 20, 19) percent of the variance in students' verbal 
achievement. 
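
The block-by-block accounting used in this study can be illustrated with a minimal sketch. It is not the Coleman report's own computation: the data are invented, and the helper names (`solve`, `r_squared`, `incremental_r2`) are hypothetical. It shows only that the explanatory power credited to a block is the rise in R-squared when that block enters after the blocks before it.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """R-squared of a least-squares regression of y on columns plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(rows[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def incremental_r2(blocks, y):
    """Enter blocks in the given order; return each block's addition to R-squared."""
    entered, previous, gains = [], 0.0, {}
    for name, cols in blocks:
        entered.extend(cols)
        current = r_squared(entered, y)
        gains[name] = current - previous
        previous = current
    return gains


# Invented illustrative data for eight schools (not from the survey).
background = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
teacher = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
school = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0]
verbal = [3.0, 4.0, 6.0, 8.0, 10.0, 9.0, 13.0, 15.0]

blocks = [("background", [background]), ("teachers", [teacher]), ("school", [school])]
gains = incremental_r2(blocks, verbal)
full_r2 = r_squared([background, teacher, school], verbal)
```

Because each block is credited only with what it adds to the blocks already entered, the attribution can change with the order of entry whenever the blocks are correlated with one another.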



Jesse Burkhead, Thomas G. Fox, and John W. Holland, Input and Output 
in Large City High Schools, Syracuse University Press, Syracuse, 1967. 



UNIT OF ANALYSIS 
School. 

SAMPLE 



(a) 39 Chicago schools, (b) 22 Atlanta schools, (c) 177 Project 
TALENT schools. 

DATA 

Chicago and Atlanta data were drawn from school district records. 
The TALENT sample was drawn from Project TALENT file. Occasional var- 
iables were drawn from the Census. 

VARIABLES 

Independent 

1. Median family income in school's attendance area (Census). 

2. ADA. 

3. Age of building. 

4. Textbook expenditures/pupil. 

5. Materials and supplies expenditure/pupil. 

6. Median teacher experience. 

7. Percent teachers with M.A. or higher. 

8. Teacher man-years/pupil. 

9. Administrator man-years/pupil. 

10. Auxiliary man-years/pupil. 

Dependent 

1. Percent 11th graders in "stanines" 5-9 on IQ test/percent 
11th graders in stanines 5-9 in norm group for test. (A 
stanine is an interval along a nine-point, ten-equal-interval 
line.) 

2. Identical index calculated from a reading test. 

3. Percent dropouts, 11th grade. 

4. Percent 11th graders expressing college intentions. 

5. Residual from simple regression of 11th grade IQ index 
on similarly defined index for that year's 9th graders 
in same school. 

6. Identical index calculated from a reading test. 

Independent 

1. Total expenditures/pupil. 

2. Library expenditures/pupil. 

3. Average teacher salary. 

4. Enrollment/teacher. 

5. Teacher turnover. 

6. Registration beginning of year. 

7. Median family income in school's attendance area (Census). 

8. ADA. 

9. Age of building. 

Dependent 

1. School median on verbal test, 10th graders. 

2. Percent male dropouts, all grades. 

3. Percent graduates who went on to college. 

4. Residual from simple regression of 10th grade verbal test 
median on 8th graders' median score on same test that year. 
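
The residual outcome measures in these lists can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the school medians are invented and the function names `fit_line` and `residual_index` are hypothetical. Each school's index is the part of its later-grade score that a simple regression on the earlier-grade score does not predict.

```python
def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residual_index(earlier, later):
    """Residual of each school's later score from the fitted line on its earlier score."""
    intercept, slope = fit_line(earlier, later)
    return [l - (intercept + slope * e) for e, l in zip(earlier, later)]


# Invented school medians for five schools (not from the study).
grade8_median = [40.0, 45.0, 50.0, 55.0, 60.0]
grade10_median = [42.0, 50.0, 51.0, 60.0, 62.0]
index = residual_index(grade8_median, grade10_median)
```

A positive value marks a school whose later-grade score is above what its earlier-grade score predicts; by construction the residuals average to zero across schools.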

Independent 

1. Books in library/12th grader. 

2. Mean class size. 

3. Beginning salary, male teachers. 

4. 12th grade enrollment. 

5. Median family income in attendance area (Census). 

6. Age of building. 

7. Median teacher experience. 

8. Total expenditures/pupil. 

Dependent 

1. 12th grade reading scores, school mean. 

2. Percent dropout, all grades. 

3. Percent graduates who went on to college. 

4. Residual of mean 12th grade reading score regressed on 
mean 10th grade reading score, same test, school, year. 



PROCEDURE 



Stepwise multiple regression for each dependent variable. 

RESULTS 

(a) Nothing significant showed up in the IQ residual regression. 
Family income was significant positive in reading and IQ index 
regressions; nothing else was significant. Teacher experience was significant 
positive in residual reading score regression; nothing else was 
significant. Family income, age of building (counting from oldest), and 
material and supplies expenditures/pupil were significant negative in 
dropout regression. There was nothing significant for college 
intentions. 

(b) There was nothing significant for post-high-school. Family 
income, library expenditures/pupil, and average teacher salary were 
significant negative, and total expenditures/pupil and registration 
were significant positive, in dropout regression. Family income was 
significant positive and registration significant negative in verbal 
score regression. Teacher turnover was significant negative in 
residual verbal score regression. 

(c) There was nothing significant in either percent dropout or 
percent college. Books in library/12th grader was significant and 
positive in residual regression. Family income, building age, teacher 
experience, and salary were significant in reading scores regression. 

Eric Hanushek, The Education of Negroes and Whites, unpublished Ph.D. 
dissertation (microf ), Massachusetts Institute of Technology, 1968. 

UNIT OF ANALYSIS 
School. 

SAMPLE 

471 schools with five or more white 6th grade students and 242 
schools with five or more black 6th grade students. All schools were 
in the Northeast or Great Lakes regions. 

DATA 

All data were drawn from the Equal Educational Opportunity Survey 
(EEOS). All variables were school aggregates across all 6th graders in 
school. 



VARIABLES 

Independent 



1. Possessions in home. 

2. Father's education. 

3. Family size. 

4. School in central city. 

5. Percent Negro students. 

6. Teacher's experience. 

7. Teacher's verbal ability. 

8. Percent students have non-white teacher previous year. 

9. Percent who attended nursery school. 

10. Percent student out-migration previous year. 

11. Percent students who wish to finish high school. 

12. Percent students who feel they have little chance of success. 



Dependent 

Verbal score. 



PROCEDURE 



Two regressions were run in log-log form, one each for white 
schools and black schools. 
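
A useful property of the log-log form is that each coefficient can be read as an elasticity: the approximate percentage change in the outcome associated with a one percent change in the input. The sketch below is illustrative only; the data are constructed so that the outcome equals exactly 2 times the square root of the input, and the names `fit_line` and `loglog_fit` are hypothetical. Nothing here reproduces Hanushek's estimates.

```python
import math


def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loglog_fit(x, y):
    """Regress log(y) on log(x); the slope is the elasticity of y with respect to x."""
    return fit_line([math.log(v) for v in x], [math.log(v) for v in y])


# Constructed data: outputs = 2 * inputs ** 0.5, so the elasticity is 0.5.
inputs = [1.0, 4.0, 9.0, 16.0, 25.0]
outputs = [2.0, 4.0, 6.0, 8.0, 10.0]
log_intercept, elasticity = loglog_fit(inputs, outputs)
```

With these constructed values the fitted slope recovers the exponent 0.5 and the exponentiated intercept recovers the multiplier 2.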



RESULTS 



In the white sample all variables were significant except family 
size and student out-migration. The nursery school and out-migration 
variables were omitted from the black regression. All other variables 
entered and were significant except father's education and percent 
non-white teachers previous year. Signs were the same in both regressions, 
with possessions, father's education, nursery school, percent wishing 
to finish high school, teacher's verbal score, and teacher's experience 
being positive. 









Martin T. Katzman, "Distribution and Production in a Big City Elementary 
School System," Yale Economic Essays, Vol. 8 (spring 1968), 201-256. 

UNIT OF ANALYSIS 
School. 

SAMPLE 

56 Boston schools. 

DATA 

Obtained from local (Boston) sources. 

VARIABLES 

Independent 

1. Class size. 

2. Percent students in classes greater than 35. 

3. Students/staff. 

4. Size of school area. 

5. Percent teachers with permanent status. 

6. Percent permanent teachers M.A. or greater. 

7. Percent permanent teachers 1-10 years experience. 

8. Percent turnover. 

9. Percent seating capacity utilized. 

10. Index of cultural advantage. 

Dependent 

1. Attendance rate. 

2. ADA percent of initial enrollment. 

3. Median score on reading test, 6th grade; ditto, 2nd grade. 

4. Percent taking Latin School test. 

5. Percent passing Latin School test. 

6. Continuation rate (100 minus dropout rate of alumni). 



PROCEDURE 

Stepwise multiple regression for each of the dependent variables. 

RESULTS 



The index of cultural advantage was significant and positive in 
all equations except 4 and 5. Also, the size of the school area was 
significant and positive in 1-3 and 6. Teacher inexperience was 
significant and negative in 3, significant and positive in 1, 2, and 6. 
Students/staff was significant and negative in 3. Nothing was 
significant in 4 and 5. 

Elchanan Cohn, "Economies of Scale in Iowa High School Operations," 
Journal of Human Resources, Vol. 3 (fall 1968), 422-434. 

UNIT OF ANALYSIS 

School district. 

SAMPLE 

377 Iowa high school districts, of which 372 are one-school 
districts. 

DATA 

Provided by the Iowa State Department of Public Instruction. 

VARIABLES 

Independent 

1. Average number college semester hours/teaching assignment. 

2. Average number different teaching assignments/teacher. 

3. Median high school teacher's salary. 

4. Number of credit units offered (1 unit = 1 course for 1 year). 

5. Building value/pupil. 

6. Bonded indebtedness/pupil. 

7. Class size (number pupils/number teachers). 

8. ADA. 

Dependent 

Average composite score on the Iowa Tests of Educational 
Development administered to 12th graders in 1963 less the average composite 
score on the same battery administered to 10th graders in 1961. No 
correction for student-body changes. 

PROCEDURE 

Multiple regression. 

RESULTS 

Independent variables 1 and 2 (3 and 4) were significant and negative 
(positive). Transforming all variables into logs and rerunning yielded 
the same result. When the sample was restricted to 87 districts whose 
1960 population exceeded 5,000, only variable 2 was significant (it was 
still negative). 

Richard Raymond, "Determinants of the Quality of Primary and Secondary 
Public Education in West Virginia," Journal of Human Resources, 
Vol. 3, No. 4 (fall 1968), pp. 450-469. 

UNIT OF ANALYSIS 

School district. 

SAMPLE 

Approximately 5000 students entering West Virginia University (WVU) 
between September 1963 and September 1966 from 49 West Virginia county 
school districts. 



DATA 

Outcome data were obtained from the University. Data on school re- 
sources were obtained from various state agencies. Background factors 
were derived from the Census. 



VARIABLES 



Independent 



1. Average teacher's starting salary weighted. 

2. Average teacher's salary. 

3. Average elementary teacher's salary. 

4. Average secondary teacher's salary. 

5. Average weighted (by degree level) teacher's salary in 
contiguous counties. 

6. Average teacher's salary in contiguous counties. 

7. Percent teachers teaching in two or more fields. 

8. Students/teacher. 

9. Number of library volumes in excess of standard. 

10. Non-teaching expenditure/pupil. 

11. Median income of professional, managerial, and kindred 
occupations in county. 

12. Median family income in county. 

13. Median years of schooling by adults in county. 

14. Urbanization of county. 

15. Percent employed in professional level occupations in county. 

Dependent 

1. Mean grade point average in freshman year at WVU for sampled 
students in county minus the county quality measure computed 
from grade point averages (see procedure, below). 

2. Mean composite ACT score for sampled students in county who 
went to WVU minus the county quality measure computed from 
achievement test (ACT) composite score (see procedure, below). 



PROCEDURE 

The set of students from each county who go on to WVU is not a 
random sample of all high school graduates from that county. To 
control for this, two quality measures were defined. A stratified, random 
10 percent sample of the students who did go on to WVU was chosen. 
Their grade point average in freshman year at WVU was regressed on their 
grade point average in selected high school subjects. (The regression 
was not forced through the origin.) Then the difference between each 
student's freshman-year GPA and his selected-high-school-subjects GPA 
times the regression coefficient on high-school GPA was calculated. 
The value of this calculated variable, averaged over all students in 
a county, was taken to be the GPA quality index for that county school 
district. 

The ACT quality index was calculated for each county by an 
identical procedure, using the ACT composite scores of the students in the 
subsample as the dependent variable in the simple regression. 
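
One reading of the quality-index construction above can be sketched as follows. The sketch assumes that a student's adjusted value is freshman GPA minus the regression coefficient times high school GPA, as the text states; the records, county labels, and function names are invented for illustration and do not come from Raymond's data.

```python
from collections import defaultdict


def fit_line(x, y):
    """Least-squares intercept and slope of y on x (intercept not forced to zero)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def county_quality_index(records):
    """records: (county, high_school_gpa, freshman_gpa) tuples for sampled students."""
    hs = [r[1] for r in records]
    fr = [r[2] for r in records]
    intercept, slope = fit_line(hs, fr)
    adjusted = defaultdict(list)
    for county, hs_gpa, fr_gpa in records:
        # Freshman GPA less the part predicted by the high school GPA slope.
        adjusted[county].append(fr_gpa - slope * hs_gpa)
    index = {c: sum(v) / len(v) for c, v in adjusted.items()}
    return index, intercept, slope


# Invented sample: two students from each of three hypothetical counties.
records = [
    ("A", 3.0, 2.5), ("A", 3.5, 3.0),
    ("B", 2.5, 2.4), ("B", 3.0, 2.9),
    ("C", 3.8, 3.0), ("C", 3.2, 2.2),
]
index, intercept, slope = county_quality_index(records)
```

Under this reading a county's index is high when its students earn better freshman grades than their high school grades alone would suggest.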

Four regressions were run for each dependent variable. Independent 
variables 7 through 15 enter every regression. Independent variables 
1 through 4 each enter one regression for each dependent variable. 
Independent variable 5 enters the two regressions with independent 
variable 1. Independent variable 6 enters the six regressions with 
independent variables 2 through 4. 



RESULTS 

None of the independent variables 4 through 15 was ever significant. 
Independent variable 1 was significant when the second dependent variable 
was used, but not when the first dependent variable was used. 
Independent variables 2 and 3 were each significant in their two (each) 
regressions. 

Samuel Bowles, Educational Production Function, Final Report, U.S. 
Department of Health, Education, and Welfare, U.S. Office of 
Education, OEC-1-7-000451-2651, ED 037 590, Harvard University, Cambridge, 
Massachusetts, February 1969. 

UNIT OF ANALYSIS 

Individual/school. 

SAMPLE 

(a) Black male high school seniors in U.S. Office of Education 
regions 1, 2, and 3 in 1960 who responded to both the initial and 5-year 
follow-up Project TALENT surveys. (b) EEOS data on black students 
enrolled in the fifth grade in 1965. 

DATA 

Drawn from TALENT and EEOS data banks. Background factors on 
individual level, school resources on school level, in both data banks. 

VARIABLES 

Independent 

1. Father's occupation. 

2. Mother's occupation. 

3. Father's education. 

4. Mother's education. 

5. Own room, desk, typewriter. 

6. Appliances. 

7. TV, telephone, radio, phonograph. 

8. With whom living. 

9. Average class size, science and math. 

10. Senior class size. 

PROCEDURE 

(a) In order to maximize observations, regression coefficients 
were estimated from the relationship 
$\mathrm{cov}(x_i, x_j)\,\hat{b} = \mathrm{cov}(x_i, y)$, 
where the ijth element of $\mathrm{cov}(x_i, x_j)$ is calculated on the 
basis of all observations for which data on i and j are available, and 
similarly for $\mathrm{cov}(x_i, y)$. Separate "regressions" were run for 
each dependent variable and beta coefficients calculated. Beta 
coefficients for social class variables were summed, as were those for 
school variables. The respective sums were compared in each case to 
estimate the relative importance of each set of variables with respect to 
each dependent variable. Bowles apparently (it is never stated one way or 
the other) "fitted" his equations, deleted insignificant variables, then 
"refitted" the equations. 

(b) Essentially the same steps were repeated using EEOS data. 

Bowles then examined the specification bias stemming from the 
omission of initial endowments. 
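
The estimation from pairwise covariances described in step (a) can be sketched for the two-regressor case. This is a minimal illustration, not Bowles's procedure: the data are invented, missing values are coded as `None`, and all function names are hypothetical. Each covariance uses every observation on which both of its variables are present, and the coefficients solve the resulting two-equation system by Cramer's rule.

```python
def pairwise_cov(u, v):
    """Covariance over the observations where both u and v are present."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    return sum((a - mean_a) * (b - mean_b) for a, b in pairs) / (n - 1)


def two_variable_coefficients(x1, x2, y):
    """Solve cov(X, X) b = cov(X, y) for two regressors."""
    s11 = pairwise_cov(x1, x1)
    s22 = pairwise_cov(x2, x2)
    s12 = pairwise_cov(x1, x2)
    s1y = pairwise_cov(x1, y)
    s2y = pairwise_cov(x2, y)
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


def beta_coefficients(x1, x2, y):
    """Standardized coefficients: b scaled by sd(x) / sd(y)."""
    b1, b2 = two_variable_coefficients(x1, x2, y)
    sd_y = pairwise_cov(y, y) ** 0.5
    return (b1 * pairwise_cov(x1, x1) ** 0.5 / sd_y,
            b2 * pairwise_cov(x2, x2) ** 0.5 / sd_y)


# Invented data with y = 1 + 2*x1 + 3*x2 exactly.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [9.0, 8.0, 19.0, 18.0, 29.0, 28.0]
complete = two_variable_coefficients(x1, x2, y)

# Same data with one x1 value missing; each covariance uses what is available.
x1_gaps = [1.0, 2.0, None, 4.0, 5.0, 6.0]
gappy = two_variable_coefficients(x1_gaps, x2, y)
betas = beta_coefficients(x1, x2, y)
```

With complete data the result coincides with ordinary least squares. With gaps, each covariance rests on a different subset of observations, so the estimates generally differ from a regression on complete cases only, and the assembled covariance matrix is not guaranteed to be well behaved.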



RESULTS 

(a) Father's occupation and the measure of consumer durables 
appeared in all three equations with positive signs. Mother's education 
and mother's occupation appeared once each; both were positive. The 
sums of the beta coefficients for social class variables were .62, .46, 
and .69 in the reading, mathematics, and composite score equations, 
respectively. Teachers with graduate training/class appeared in all 
three (positive); class size in science and mathematics appeared twice 
(negative), but not in the mathematics equation. Tracking was negative 
in all three. Expenditure per student on non-teaching inputs (positive), 
age of building (negative), and educational innovation entered once 
each. The sums of the beta coefficients for school variables were 
.35, .80, and .47, in order. 

When percent black was added to each equation, it was significant 
(negative) in two cases (except reading). 

(b) Reading material in home, number of siblings, parents' 
education level, teachers' verbal ability, presence of science 
laboratory facilities, average time spent in guidance, and days in session 
were all significant and positive. Regarding days in session as a 
community variable, the sum of the beta coefficients for school inputs 
was .32, very similar to the sum of school input beta coefficients in 
the TALENT reading equation. 

Bowles then introduced student's control of environment and 
student's self-concept. Both were positive and very significant. 
Thomas G. Fox, "School System Resource Use in Production of 
Interdependent Educational Outputs" (mimeo), The Joint National Meeting, 
American Astronautical Society and Operations Research Society, 
Denver, Colorado, 1969. 

UNIT OF ANALYSIS 
School. 

SAMPLE 

39 Chicago schools. 

DATA 

Chicago school district records and the Census. 

VARIABLES 



Independent 



1. Teacher man-years. 

2. Auxiliary service man-years. 

3. Total book expenditures (text and library). 

4. Index of building utilization capacity. 

5. Capacity of building, weighted by age. 

6. Percent student class hours in vocational courses, weighted 
by number of students. 

7. Median family income in attendance area, weighted by number 
of students. 

8. Percent of students planning on college, weighted by number 
of students. 

9. Number students employed part-time. 

Dependent 

1. Eleventh grade median reading stanine weighted by number of 
students. 

2. Holding power (one minus dropout rate). 

PROCEDURE 

Two simultaneous equations were specified in double log form, one 
for each dependent variable. Each dependent variable enters the other 
dependent variable's equation as an independent variable. (The theory 
is that schools trade off between the two outputs.) Independent 
variable 8 (9) was deleted from the holding-power (reading) equation. 
Two-stage least squares (TSLS) was used to estimate the simultaneous system. 
Independent variable 4 was deleted (insignificant), and the reduced 
forms were calculated and estimated using ordinary least squares. 
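
The two-stage idea can be sketched in its simplest form, with one endogenous regressor and one excluded exogenous variable serving as the instrument. This is a deliberately reduced illustration, not Fox's system: it omits the other exogenous variables and the double log transformation, the numbers are invented, and the function names are hypothetical.

```python
def cov(u, v):
    """Sample covariance of two equal-length lists."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    slope = cov(x, y) / cov(x, x)
    intercept = sum(y) / len(y) - slope * sum(x) / len(x)
    return intercept, slope


def two_stage_slope(instrument, endogenous, outcome):
    """Stage 1: fit the endogenous regressor on the instrument.
    Stage 2: regress the outcome on the stage-1 fitted values."""
    a, pi = fit_line(instrument, endogenous)
    fitted = [a + pi * z for z in instrument]
    _, slope = fit_line(fitted, outcome)
    return slope


# Invented values for six schools (not from the Chicago data).
excluded_exogenous = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
holding_power = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
reading = [3.0, 3.8, 5.1, 5.2, 6.9, 7.4]

tsls = two_stage_slope(excluded_exogenous, holding_power, reading)
ratio = cov(excluded_exogenous, reading) / cov(excluded_exogenous, holding_power)
```

In this one-instrument case the second-stage slope equals the ratio of the instrument's covariance with the outcome to its covariance with the endogenous regressor, which is why a variable excluded from an equation is needed to identify it.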

RESULTS 

Holding power (positive) and total teacher man-years, total text 
and library book expenditures, and vocational class student hours (all 
negative) were significant in the TSLS equation for reading. Family 
income had a t-ratio below one. In the holding-power equation, reading, 
total teacher man-years, total book expenditures, and vocational class 
student hours were all positive and significant. Total auxiliary 
man-years, building capacity weighted by age, and total family income were 
all negative and significant. No significance statistics were presented 
for the reduced-form equations. All variables were positive except book 
expenditures and building capacity-age code in the reduced-form reading 
equation. Only total students employed part-time was negative in the 
holding-power reduced-form equation. 

Herbert J. Kiesling, The Relationship of School Inputs to Public School 
Performance in New York State, P-4211, The Rand Corporation, October 
1969. 

UNIT OF ANALYSIS 

School District. 



SAMPLE 

97 school districts. 



DATA 

The dependent variable is the average for all 6th grade pupils who 
were in the same school and took the same test in the 4th grade. School 
resource and family background measures were drawn from district records. 



VARIABLES 



Independent 



1. Teachers/pupil. 

2. Principals and supervisors/pupil. 

3. Special staff/pupil. 

4. Expenditures/pupil for books and supplies. 

5. Median teacher salary. 

6. Average salary of teachers in top salary decile. 

7. Index of occupation of family breadwinner of 5th grade pupils. 

8. School district debt/pupil. 

9. School district growth rate, 1950-1958. 

10. ADA. 

11. School property value/pupil. 

12. Salary of superintendent of schools. 

13. Mean salary of principals. 

14. Expenditures/pupil for principals, assistant principals, and 
supervisors. 

15. School district value of buildings/classroom. 

16. School district value of furniture and equipment/classroom. 

17. Median years teacher experience in school district. 

Dependent 

1. Composite score on Iowa test of basic skills. 

2. Arithmetic score on Iowa. 

3. Language score on Iowa. 

Three variants of each dependent variable were used: 

a. School district mean for 6th graders who were in the same 
school and took the same test in the 4th grade. 

b. School district gain at the mean, 4th grade to 6th grade 
(for all students present in both the 4th and the 6th grade, 
school district mean in 6th grade less school district mean in 
4th grade). 

c. School district mean for 6th graders who were present in the 
4th grade with those pupils' mean score in the 4th grade 
entered as an independent variable. 

The sample was stratified into five subsamples on the basis of the 
family breadwinner's occupation. For each subsample and for the total 
sample, the nine dependent variables were computed (i.e., averaging 
over the pupils in each subsample). Thus the study included 54 
dependent variables. Districts were then divided into two groups, urban 
and non-urban, and the 54 regressions were run for each group. 
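
The count of regressions follows from the design just described: three test scores, three variants of each, and six strata (five occupational subsamples plus the total sample), run separately for two groups of districts. A short enumeration confirms the arithmetic; the labels are shorthand invented for this sketch.

```python
from itertools import product

scores = ["composite", "arithmetic", "language"]
variants = ["level", "gain", "level_with_4th_grade_control"]
strata = ["occupation_1", "occupation_2", "occupation_3",
          "occupation_4", "occupation_5", "total_sample"]
groups = ["urban", "non_urban"]

# One dependent variable per (score, variant, stratum) combination.
dependent_variables = list(product(scores, variants, strata))

# Each dependent variable is regressed once within each group of districts.
regressions = list(product(groups, dependent_variables))
```

Here `len(dependent_variables)` is 54 and `len(regressions)` is 108, matching the totals reported for this study.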

PROCEDURE 

Factor analysis, a priori reasoning, and inspection of simple 
correlations were used to reduce the list of independent variables to six: 
index of occupation, teachers/pupil, expenditures/pupil for books and 
supplies, average salary of teachers in top salary decile, value of 
school district property/pupil, and expenditures/pupil for principals 
and supervisors. 108 regressions were then run. 

RESULTS 
Rural Sample 

In the 54 regressions run on the sample of rural districts, only 
the occupation index was ever significant. 

Urban Sample 

The author claims (regression results are reported for 30 of the 
54 regressions) that: (1) there are major differences in findings among 
the three variants of each dependent variable; (2) findings for all 
three test scores are basically the same; (3) the index of occupation 
is always positive and significant; (4) teachers/pupil and expenditures/ 
pupil for books and supplies consistently related negatively to the 
dependent variable, often at an advanced level of significance (in the 
30 reported regressions the teacher-pupil ratio was significant in 
12 cases and expenditures per pupil was significant in 10 cases); and 
(5) none of the other three variables was uniformly important, although 
each was important at one time or another. (In 7 of the 30 reported 
regressions, none of the other three school variables was significant.) 

Herbert Kiesling, A Study of Cost and Quality of New York School Districts, 
U.S. Department of Health, Education, and Welfare, U.S. Office of 
Education, 8-0264, Washington, D.C., February 1970. 

UNIT OF ANALYSIS 

School District. 

SAMPLE 

Fifth and 8th grade pupils in 86 school districts in New York 
State. Eighth, and in some cases 5th, grade students in 273 schools 
in New York State. 

Ninety-nine school districts were chosen from among New York school 
districts that used the Iowa Test of Basic Skills in the 5th and 8th 
grades in the 1964-1965 school year. Usable information was obtained 
from 86 of them. Test scores and data concerning parents' occupations 
and education were obtained from the districts. School resources data 
were obtained from New York's Basic Educational Data System, which 
began collecting detailed data on New York schools in 1967. 

The selection of schools for the second part of the analysis is 
not described. 

VARIABLES 

Independent 

1. Average teacher salary. 

2. Average teacher experience. 

3. Average teacher degree level. 

4. Average teacher certification. 

5. Pupils/classroom. 

6. Pupils/laboratory. 

7. Pupils/academic classroom. 

8. Value of school-district-owned property/pupil. 

9. Average salary of non-classroom professionals. 

10. Principal's experience. 

11. Principal's degree level. 

12. Father's education level. 

13. Mother's education level. 

14. Father's occupation level. 

15. Pupils/teacher. 

16. Expenditures/pupil on central administration. 

17. Principals and supervisors/pupil. 

Dependent 

1. Score on Iowa mathematics test. 

2. Score on Iowa verbal test. 

3. Composite score. 
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PROCEDURE 



All school variables were averages over the school district. For
each of the three dependent variables at each grade level (5th and 8th),
the sample was divided into seven subsamples on the basis of the father's
education. The dependent variable for each subsample was computed as
the average score over all students in a district in the subsample. The
design thus generated 42 regressions, but sample sizes were too small to
support analysis in three cases. The sample was then restratified by
seven categories of the father's occupation and the procedure was
repeated. All 42 regressions were run. The independent variables in all
81 equations were mother's educational level (district average over all
students in the subsample); district-wide averages of teacher salary,
experience, degree level, and certification; the pupils/teacher ratio;
and expenditures/pupil on central administration.
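
The count of 81 equations follows directly from the stratification design just described. The arithmetic can be sketched in Python as follows; the variable names are illustrative and do not come from the study.

```python
# Design arithmetic for the first set of regressions (names are illustrative).
dependent_variables = 3   # mathematics, verbal, and composite scores
grade_levels = 2          # 5th and 8th grades
strata = 7                # categories of father's education (or occupation)

per_stratification = dependent_variables * grade_levels * strata

# Three education-stratified regressions had samples too small to analyze.
dropped_for_small_samples = 3
by_education = per_stratification - dropped_for_small_samples
by_occupation = per_stratification

total_equations = by_education + by_occupation
print(per_stratification, by_education, total_equations)
```

Under these counts the education stratification contributes 39 equations and the occupation stratification 42.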



An alternative model was formulated in which administrative
expenditures/pupil was dropped from the regressions and the value of
school-district-owned property and the number of principals and supervisors (both
on a per-pupil basis) were inserted. This model was run on the seven
stratified-by-occupation 5th grade subsamples and the six
stratified-by-education 8th grade subsamples. The composite score, averaged over
the subsample, was the dependent variable in all cases.



A factor analysis of the independent variables suggested another
alternative specification of the model. Mother's education level,
average teacher's degree level, experience, and salary, the pupils/teacher
ratio, administrative expenditures per student, and the pupils/classroom
ratio were the independent variables. Composite scores for six 5th
grade and the seven 8th grade stratified-by-education subsamples were
the dependent variables.



The sample of school districts was divided on the basis of
population density into two groups, urban and rural. The original model
(independent variables 1-4, 13, 15, and 16) was fitted for seven 5th
grade urban district and six 5th grade rural district
stratified-by-education subsamples. The dependent variables were average composite
scores over all pupils in the subsamples.

The sample of districts was divided into two groups: those within and
those outside the Standard Metropolitan Statistical Area. For the districts
in each group, seven stratified-by-education regressions were run. The
original set of independent variables (1-4, 13, 15, and 16) was used.
The dependent variables were average composite scores for all 5th graders
in each subsample.

Finally, data were collected on the level of the individual school.
Eight subsamples were defined: schools in districts with six or more
schools; schools in districts with five or fewer schools; all schools
in each of Albany, Binghamton, Niagara Falls, Schenectady, and Syracuse; and
all schools. The dependent variables were not defined.

RESULTS 

The various stratification schemes add up to 127 regressions. The
box score is:



Variable


No. of Regressions in Which Variable Is Entered


No. of Regressions in Which Variable Is Significant with Positive Sign


No. of Regressions in Which Variable Is Significant with Negative Sign


Mother's education level 


127 




48 


Pupils /classroom 


6 




1 


Teacher certification level 


121


28 


1 


Teacher degree level 


127 


4 


4 


Teacher experience 


127 


18 


4 


Teacher salary 


127 


1 


8 

\ 3 


Pupils /teacher 


127 


3 


Administrative expenditures/ 
pupil 


114 


36 




Value of school-district 
property/pupil 


13 




2 


Administrative personnel/ 
pupil 


13 
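
A box score of this kind can be tallied mechanically once the coefficient and t-statistic of every variable in every regression are in hand. The sketch below is illustrative only: the `tally_box_score` helper, the 1.96 cutoff (an assumed two-sided 5 percent level), and the toy inputs are assumptions and are not taken from the study.

```python
from collections import defaultdict

def tally_box_score(regressions, critical_t=1.96):
    """Count, per variable, how often it is entered and how often it is
    significant with a positive or a negative sign.

    `regressions` is a list of dicts mapping variable name -> (coef, t_stat).
    """
    box = defaultdict(lambda: {"entered": 0, "sig_pos": 0, "sig_neg": 0})
    for reg in regressions:
        for name, (coef, t_stat) in reg.items():
            box[name]["entered"] += 1
            if abs(t_stat) >= critical_t:
                box[name]["sig_pos" if coef > 0 else "sig_neg"] += 1
    return dict(box)

# Two made-up regressions, for illustration only.
toy = [
    {"teacher_experience": (0.4, 2.5), "pupils_per_teacher": (-0.1, -0.8)},
    {"teacher_experience": (0.2, 1.1), "pupils_per_teacher": (-0.3, -2.2)},
]
box = tally_box_score(toy)
for name, counts in box.items():
    print(name, counts)
```

A tally like this records only the sign and significance of each coefficient, not its size, which is why a box score is best read as a rough summary.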
Henry M. Levin, "A New Model of School Effectiveness," Do Teachers Make
a Difference?, U.S. Department of Health, Education, and Welfare,
U.S. Office of Education, Bureau of Educational Personnel
Development, OE-58042, 1970, pp. 55-75.

UNIT OF ANALYSIS 

School/individual.

SAMPLE 

597 white, 6th grade students in 36 schools in a large Eastern 
city who had attended no other school. 

DATA 

All data drawn from EEOS. School resources measured on school 
level; background factors measured on individual level. 

VARIABLES

Independent 

1. Sex. 

2. Age.

3. Possessions in student’s home. 

4. Family size. 

5. Real (or surrogate) mother in home. 

6. Real (or surrogate) father in home.

7. Father’s education. 

8. Mother’s employment status. 

9. Attended kindergarten. 

10. Teacher’s verbal score. 

11. Teacher's parents' income.

12. Teacher experience.

13. Whether teacher's undergraduate institution was a university or a
college.

14. Teacher’s satisfaction with present school. 

15. Percent white students. 

16. Teacher turnover. 

17. Library volumes per student. 
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Dependent 

1. Student's attitude. 

2. Parents' attitude.

3. Student's grade aspiration. 

4. Student's verbal score. 



PROCEDURE 

One equation was specified for each of the dependent variables. The
verbal score equation included the other three dependent variables and
all independent variables except 5, 6, 8, 11, and 15. The student's
attitude equation included verbal score and parents' attitude and
independent variables 1-4, 6-8, 14, 16, and 17. The grade aspiration
equation included verbal score and parents' attitude and all independent
variables except 7, 10, 12, and 15-17. The parents' attitude equation
included independent variables 1, 3-6, 8, 15, and 16. Ordinary
least squares (OLS), two-stage least squares (TSLS), and reduced-form
estimates (RFE) were calculated for dependent variables 1, 3, and 4.
OLS was used for the parents' attitude equation.
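
The difference between OLS and TSLS in a setting like this one can be illustrated with a small simulation. The sketch below is not Levin's model: it uses one endogenous regressor, one instrument, synthetic data, and made-up coefficients (a true slope of 2.0), all of which are assumptions chosen for illustration.

```python
import random

def simple_ols(x, y):
    """Return (intercept, slope) from a one-regressor least squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

random.seed(0)
n = 20000
z = [random.gauss(0, 1) for _ in range(n)]   # instrument (exogenous)
u = [random.gauss(0, 1) for _ in range(n)]   # unobserved common factor
# x is endogenous: it shares the factor u with the disturbance of y.
x = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [2.0 * xi + 3.0 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

# OLS ignores the correlation between x and the disturbance.
_, ols_slope = simple_ols(x, y)

# Stage 1: regress x on the instrument and form fitted values.
a1, b1 = simple_ols(z, x)
x_hat = [a1 + b1 * zi for zi in z]
# Stage 2: regress y on the fitted values from stage 1.
_, tsls_slope = simple_ols(x_hat, y)

print(round(ols_slope, 2), round(tsls_slope, 2))
```

In this construction the OLS slope is pulled away from the true value by the shared factor, while the two-stage estimate stays close to it. A similar mechanism would be consistent with the pattern reported below, in which variables significant under OLS become insignificant under TSLS.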

RESULTS 
Verbal Score 

Student's attitude, parents' attitude, and grade aspiration were
all significantly and positively related to verbal score in the OLS
estimation. All were insignificant when TSLS was used. Age and family
size (both negative) as well as possessions, father's education,
teacher experience, and teacher's undergraduate institution (all
positive) were significant in the OLS estimates. Only age (negative) and
teacher experience (positive) were significant in the TSLS estimates.

Student Attitude 

Verbal score, attended kindergarten, teacher's satisfaction
(positive), and mother in home (negative) were significant in the OLS estimates.
TSLS yielded the same results except that mother's employment was also
significant (positive).







Parents' Attitude

Possessions (positive) and family size, mother in home, and per- 
cent white students (all negative) were significant. 

Stephan Michelson, "The Association of Teacher Resourcefulness with
Children's Characteristics," Do Teachers Make a Difference?, U.S.
Department of Health, Education, and Welfare, U.S. Office of
Education, Bureau of Educational Personnel Development, OE-58042,
1970, pp. 120-168.

UNIT OF ANALYSIS 

School/individual.

SAMPLE 

597 white and 458 black 6th grade students in an unknown number of 
schools in a large Eastern city who had attended no other school. 



DATA

All data were drawn from EEOS. School resources were measured on
the school level. Teacher's attributes were averaged over teachers in
the 3rd, 4th, and 5th grades. Background factors were measured on the
individual level.

VARIABLES 

Independent 

1. Sex. 

2. Age.

3. Family size. 

4. Possessions in student's home. 

5. Father's education. 

6. Attended kindergarten. 

7. Real (or surrogate) mother in home. 

8. Teacher's verbal ability.

9. Teacher's experience.

10. Teacher tenure.

11. Discrepancy between teacher's reported and desired percentage
of white students.

12. Teacher's desired percentage of white students. 

13. Whether teacher was academic major in college. 

14. Whether school tracks (by ability groups). 

15. Library books. 

16. Whether school has auditorium, cafeteria, gymnasium.

17. Percent students in upper quartiles on test. 

18. Size of school site (acres). 

19-22. Four interaction terms crossing student socioeconomic status
(SES), student-body SES, and level of school resources;
the construction of these variables is not clearly defined.

23. Father's occupation. 

24. Mother's education. 

25. Percent teachers white.

26. Teacher's parents' education.

27. Teacher's years of schooling.

28. Whether school has adequate texts ("adequate" not defined).

29. Age of school building. 

30. Assignment (?). 

31. Whether mother employed. 

32. Teacher turnover. 

33. Teacher's preference for another school. 

Dependent 

1. Verbal test score. 

2. Reading test score. 

3. Mathematics test score. 

4. Student's attitude. 

5. Student's grade aspiration. 

PROCEDURE 

The sample was stratified by race, and seven regressions (two
each for dependent variables 1 and 3, three for dependent variable 2)
were run for whites. Five regressions were run for blacks, two each
for dependent variables 1 and 2 and one for dependent variable 3.
Although there is considerable overlap, the set of independent variables
entered into each regression differs between regressions within and
between subsamples. A system of three simultaneous equations with the
dependent variables verbal test score, student's attitude, and student's
grade aspiration was specified and estimated using two-stage least
squares for each of the racial subsamples. Specifications were the
same in both cases.

RESULTS

Whites - Single Equation Model 

Sex entered all seven white regressions and was significant in
five of them. Its sign was positive when reading was the dependent
variable, and negative when mathematics was the dependent variable.
Age and family size were both significant and negative in each of the
seven white regressions. Possessions and father's education were both
significant and positive in each of the seven.

The remaining variables entered in each of the seven regressions
(and their signs, if significant) were as follows:

Verbal 1: 6(+), 9(+), 11(-), 15(+), 16(+), 19(-), 20, 21,




respect to reading 2. significant and negative in all five cases.
Possessions also entered all five regressions, but was significant (+)
only in the two verbal equations. Family size entered the two verbal








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The other variables (and their signs, if significant) in each re- 
gression were as follows: 



Verbal 1: 10(-), 14(-), 24(+), 26(+), 28(+).

Verbal 2: 14(-), 15, 24(+), 26(+), 28(+), 29(-).

Reading 1: 5(+), 6(+), 7(-), 8, 13(-), 20(+), 23, 26.

Reading 2: 6(+), 7(-), 13(-), 20(+), 23, 25, 26(+).

Mathematics 1: 5(+), 23, 26, 27, 30(+).



Whites - Simultaneous Equation Model 

The three dependent variables in the model were student's verbal
test score, attitude, and grade aspiration. Attitude and grade
aspiration were not significant in the verbal equation. Verbal score was
significant and positive in the attitude and aspiration equations.
Independent variables entered (and their signs, if significant) were as
follows:

Verbal: 1, 2(-), 3, 4, 5, 6, 8, 9(+), 13(+), 15, 32.

Attitude: 1(+), 2, 3(-), 4, 5(+), 6, 7, 32(-), 33.

Aspiration: 1, ^ 3, 4, 5, 6, 7(-), 13, 31(+), 33(+).

Blacks - Simultaneous Equations Model

No measures of statistical significance were reported. 

Eric Hanushek, The Value of Teachers in Teaching, RM-6362-CC/RC,
The Rand Corporation, December 1970.

UNIT OF ANALYSIS

Individual.

SAMPLE

1,061 3rd grade students in a large California school system.

DATA

Student information was derived from cumulative school records,
information on their teachers from a survey.




VARIABLES 



Independent 

1. A student's 3rd grade teacher's experience.

2. A student's 3rd grade teacher's semester hours of graduate
work.

3. A student's 2nd grade teacher's experience.

4. A student's 2nd grade teacher's semester hours of graduate
work.

5. Sex.

6. Family income. 

7. Number of siblings. 

8. Number of absences. 

9. Percent Mexican-Americans in school. 

10. Average income in school.

11. Student's score on Stanford Achievement Test in 1st grade.

12. Student's score on Stanford Achievement Test in 2nd grade. 

13. Whether student repeated grade. 

14. Percent of time 3rd grade teacher spends on discipline. 

15. Third grade teacher's verbal facility. 

16. Years since most recent educational experience, 3rd grade 
teacher. 

17. Second grade teacher's verbal facility. 

18. Years since most recent educational experience, 2nd grade 
teacher. 

19. Whether father in clerical job.

20. Years of experience teaching students of this SES level,
3rd grade teacher.

21. Years of experience teaching students of this SES level,
2nd grade teacher.
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Dependent 

1. Student's SAT score in 3rd grade. 

2. Student's SAT score in 2nd grade.



PROCEDURE 

The records of all children in the 3rd grade in the system (2,445
students) were examined. Data on all independent variables were
available for 1,061 students. Individual teachers were matched to
individual students. This sample was divided into three subsamples: 323
whites whose fathers had nonmanual jobs, 515 whites whose fathers had
manual jobs, and 140 Mexican-Americans whose fathers had manual jobs. A
separate regression was run for each subsample and for an "all-whites"
subsample, the first dependent variable being regressed against the
first 11 independent variables.


Then, for each of the three subsamples, 3rd grade SAT score was
regressed on sex, 2nd grade SAT score, and a series of dummy variables
T_ij, where T_ij = 1 if the ith student has the jth teacher in the 3rd
grade. These analyses were then repeated with 2nd grade SAT score as
the dependent variable and sex, 1st grade SAT score, and the teacher
dummy variables as the three independent variables.
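
Testing whether teachers differ amounts to comparing a fit with one common intercept against a fit with a separate dummy for each teacher. The sketch below shows that comparison as an F-statistic in a deliberately simplified form: it omits the sex and prior-score covariates used in the study, and the `equal_effects_f` helper and the toy scores are hypothetical.

```python
def equal_effects_f(scores_by_teacher):
    """F-statistic for the hypothesis that all teachers share one intercept.

    `scores_by_teacher` maps a teacher label to a list of student outcomes.
    Returns (F, numerator df, denominator df).
    """
    groups = [g for g in scores_by_teacher.values() if g]
    n = sum(len(g) for g in groups)
    j = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Restricted model: a single common intercept.
    rss_restricted = sum((v - grand_mean) ** 2 for g in groups for v in g)
    # Unrestricted model: one intercept (dummy variable) per teacher.
    rss_unrestricted = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups for v in g
    )
    df_num, df_den = j - 1, n - j
    f_stat = ((rss_restricted - rss_unrestricted) / df_num) / (
        rss_unrestricted / df_den
    )
    return f_stat, df_num, df_den

# Made-up outcomes for three teachers, for illustration only.
toy = {"A": [50, 52, 54], "B": [60, 62, 64], "C": [55, 57, 59]}
f_stat, df_num, df_den = equal_effects_f(toy)
print(f_stat, df_num, df_den)
```

A large F relative to the critical value for the stated degrees of freedom leads to rejecting the hypothesis of identical teacher effects.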

Last, stepwise regressions were run for the two white subsamples.
The dependent variable in each case was 3rd grade SAT score. The
complete set of independent variables considered is not given. The
reported results list, for the white manual subsample, sex, 1st grade
SAT score, and independent variables 13-18; and for the white nonmanual
subsample, 1st grade SAT score and independent variables 16 and 18-21.
The author states that rejected variables include school composition
in terms of occupational distribution, ethnic distribution, and
achievement distribution; objective background characteristics of the teachers
such as socioeconomic status, college major, and membership in
professional organizations; and various measures of a teacher's attitudes
toward his students.
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RESULTS 






The complete results of the first set of regressions mentioned
above are not given. However, the teacher variables (3rd grade
teacher's experience and advanced training, and 2nd grade teacher's
experience and advanced training) were all insignificant in each of
the four regressions.

For whites the hypothesis that the coefficients of the teacher dummy
variables are identical was rejected at the 1 percent level. This was
true for both the 2nd and the 3rd grade regressions and for both the
manual and the nonmanual subsamples. However, for Mexican-Americans, the
hypothesis that all teachers had an identical effect could not be rejected at
either the 2nd or 3rd grade level.

In the last set of regressions, sex (+), whether grade repeated
(-), 1st grade SAT score (+), time spent on discipline (-), 3rd grade
teacher's verbal facility (+), and 2nd grade teacher's years since
educational experience (-) were significant for the white manual subsample.
For the white nonmanual subsample only 1st grade SAT score (+) and
whether father had clerical job (- if yes) were significant.



Harvey Averch and Herbert Kiesling, The Relationship of School and
Environment to Student Performance: Some Simultaneous Models for
the Project TALENT High Schools, unpublished paper, The Rand
Corporation, 1970.



UNIT OF ANALYSIS 



a. School. 

b. School/individual. 



SAMPLE 






a. About 5000 9th graders from 746 public comprehensive and
college preparatory high schools.

b. 820 9th graders randomly chosen from the above group.



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DATA 

All data were derived from Project TALENT. School variables were
on the school level. In the first part of the analysis, individual
achievement scores were averaged by school. In the second part,
individual scores were the output measure.

VARIABLES 

Independent 

1. Socioeconomic index.

2. Perceived needs of staff. 

3. Percent students to juvenile court. 

4. Principal's degree level. 

5. Number of tracks. 

6. Average class size. 

7. Percent teachers certified. 

8. Average salary, male teachers. 

Dependent 

1. Percent teacher transfers. 

2. Expected education. 

3. Student achievement. 

PROCEDURE 

In the first part, a set of three simultaneous equations (one for
each dependent variable) was estimated by two-stage least squares.
Expected education, student achievement, percent students to
juvenile court, and perceived needs of staff entered the teacher
transfers equation as independent variables. Student achievement,
socioeconomic index, and percent teacher transfers were the independent
variables in the expected education equation. Independent variables 4 through
8, expected education, and percent teacher transfers were used to
explain student achievement. Reduced-form estimates were computed and
compared with ordinary least squares estimates.
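
A reduced form expresses each jointly determined variable in terms of the exogenous variables alone. The sketch below solves a two-equation simplification of the system (achievement and expected education only, one exogenous variable in each equation); the `reduced_form` helper and the coefficient values are hypothetical and are not the study's estimates.

```python
def reduced_form(a1, b1, a2, b2):
    """Solve the structural system
        achievement = a1 * expectation + b1 * resource
        expectation = a2 * achievement + b2 * ses
    for each endogenous variable in terms of `resource` and `ses`.
    """
    det = 1.0 - a1 * a2
    if det == 0:
        raise ValueError("system has no unique reduced form")
    return {
        "achievement": {"resource": b1 / det, "ses": a1 * b2 / det},
        "expectation": {"resource": a2 * b1 / det, "ses": b2 / det},
    }

# Hypothetical structural coefficients, for illustration only.
rf = reduced_form(a1=0.5, b1=2.0, a2=0.4, b2=1.0)
print(rf["achievement"])
print(rf["expectation"])
```

The reduced-form coefficient of an exogenous variable combines its direct effect with the feedback between the two endogenous variables, so it generally differs from the structural coefficient.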



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Basically the same model, for student achievement and expected 
education was then estimated on the level of the individual student. 

The percent teacher transfers equation was dropped. The results were
compared with Levin's.

RESULTS 

School Level System 

School average student achievement was found to be significantly
related to expected education (+), percent teacher transfers (-), male
teachers' average salary (+), and the number of tracks in the school.
Expected education was significantly related only to the socioeconomic
index (+). Percent teacher transfers was significantly related to
student achievement (-), expected education (+), and percent students to
juvenile court.

In the ordinary least squares version of the student achievement
and percent teacher transfers equations, all independent variables were
entered into each. Socioeconomic index (+), class size (-), male
teachers' average salary (+), and number of tracks (+) were
significantly related to student achievement. Percent students to juvenile
court (+) and class size (+) were significantly related to percent
teacher transfers. Again, only the socioeconomic index was
significantly related to expected education.

Individual Level System 

Socioeconomic index (+) was the only significant predictor of
educational expectations on the individual level. Expected education
(+) and average salary of male teachers (+) were significantly
related to student achievement.

Marshall S. Smith, Equality of Educational Opportunity: The Basic
Findings Reconsidered, unpublished paper, Center for Educational
Policy Research, Harvard Graduate School of Education, Cambridge,
Massachusetts, 1971.
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UNIT OF ANALYSIS 

Individual/school.

SAMPLE 

Northern 6th, 9th, and 12th grade subsamples from EEOS.

DATA 

EEOS. 



VARIABLES 



Independent 



1. Urbanism of background.

2. Parents' education.

3. Integrity of family.

4. Family size.

5. Possessions.

6. Reading material in home.

7. Parents' interests.

8. Parents' educational desires.

9. Proportion own encyclopedia.

10. Student transfers.

11. Attendance.

12. Proportion in college preparatory curriculum.

13. Average hours homework.

14. Teacher perception of student's quality.

15. Instructional expenditures/pupil.

16. Library volumes/pupil.

17. Science laboratory facilities.

18. Extracurricular activities.

19. Accelerated curriculum.

20. Comprehensiveness of curriculum.

24. School location.

25. Guidance counselor.

26. Promotion of slow learners.

27. Teacher's verbal achievement.

28. Teacher's degree level.

29. Teacher's socioeconomic status.

30. Teacher's preference for middle-class student.

31. Teacher's experience.

32. Teacher's localism.

33. Proportion teacher white.

34. Proportion students white.

Dependent

Verbal scores.

PROCEDURE

This is a reanalysis of the Coleman Report (see item 4, above) in
the light of various errors, omissions, and controversial techniques
alleged to be present in the original analysis.

RESULTS



Errors and Omissions 

Smith argues that Coleman and his colleagues made two mechanical
errors in creating their tables. First, two measures of home background,
parents' education and urbanism of background, were inadvertently
replaced in the analysis. Second, the student-body composition variable
called proportion planning to attend college is really a measure of the
proportion of students in the college track in the school. Further,
Coleman et al. made an error in their procedure for estimating the amount of
school-to-school difference in achievement explained by individual home
background. This error led to a serious overestimation of the possible
about the relationships between school resources and student
achievement. In particular, the Coleman data did not distinguish among trade,
vocational, academic, and comprehensive high schools.

Background Factors 

Measures of students' backgrounds bear a strong relationship to
student achievement at all grade levels, both within and between schools.
The two errors in the original analysis, leaving out two variables and
underestimating the amount of between-school variance, led to a serious
overestimation of the effect of school factors on achievement in the
Coleman Report.

Student-body Effects 

The Coleman Report's estimates of the amount of achievement
variance uniquely explained by student-body variables (numbers 9-13) are
severely reduced when the intended background controls (including the
two variables erroneously omitted) are used. The reduction is between
25 and 50 percent for whites and between 10 and 25 percent for blacks.
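
The variance "uniquely explained" by a set of variables is the increase in R-squared when that set is added after the controls. With two predictors the calculation needs only the three simple correlations. The sketch below uses hypothetical correlations chosen for illustration; neither the numbers nor the function name come from Smith's paper.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 from regressing y on two predictors, given simple correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Hypothetical correlations, chosen only for illustration.
r_y_background = 0.5     # achievement with home background
r_y_student_body = 0.4   # achievement with a student-body measure
r_between = 0.6          # background with the student-body measure

r2_full = r_squared_two_predictors(r_y_background, r_y_student_body, r_between)

# Share of variance explained by the student-body measure entered alone.
alone = r_y_student_body ** 2
# Share unique to the student-body measure once background is controlled.
unique_student_body = r2_full - r_y_background ** 2

print(alone, unique_student_body)
```

When the student-body measure is correlated with background, its unique share is much smaller than its share when entered alone, so stronger background controls shrink the variance attributed uniquely to student-body variables.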

Furthermore, Coleman et al. thought that they had included the
variable proportion of students planning to attend college, which they
interpreted as a measure of the aspirations in the student body. Instead,
however, they entered the variable proportion of students in the college
track. This measure is essentially a direct measure of selection, in that
those schools with large proportions of students in the college track
are academic schools, whereas the schools with small such proportions
are trade or vocational schools.

Finally, Smith performs a regression analysis for Northern blacks
and whites in the 6th, 9th, and 12th grades, six regressions in all.
All background and school resource variables are entered in each
regression. The basic student-body variables 9-11 enter each equation. In
addition, student-body variables 12 and 13 (14) enter each 9th and 12th
(6th) grade equation. In these twenty-eight cases (four student-body
variables in each of two 6th grade equations, five student-body
variables in each of two 9th and two 12th grade equations) student-body
variables are significant only three times. In two of these cases,
teacher perception of student-body quality (14) in both 6th grade
regressions, the variable has the "wrong" sign (negative). Proportion
in college track is significant and positive only in the 12th grade
Northern black equation.

In summary, Smith finds no evidence that characteristics of the
student body have a strong independent influence on the verbal
achievement of individual students.

School Facilities and Curriculum 

Smith investigates the same eleven school facilities and curriculum
variables (numbers 15-25) as did Coleman et al. He supports Coleman's
original finding that the relationship between facilities and curriculum
variables and student achievement is extremely slight. In the four
full regressions, including all independent variables, for Northern
blacks and whites in the 9th and 12th grades, facilities and curriculum
measures are significant in only three of 48 cases: movement between
tracks (-) for 9th grade blacks, comprehensive curriculum (-) for 9th
grade whites, and school size (-) for 12th grade blacks.

Teacher's Characteristics 

The teacher variables (numbers 27-33) are found to bear little
relationship to between-school variations in student achievement. This
is consistent with the overall conclusion reached in the EEOS report.
In the four full regressions (9th and 12th grade Northern blacks and
whites), no teacher characteristic appears to be significant in any
regression.




A SAMPLE OF EDUCATIONAL INTERVENTION WITH QUALITY RESEARCH DESIGNS 
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